Wilfredo Torres Calderon

I am a Postdoctoral Researcher at the University of Illinois at Urbana-Champaign (UIUC), where I conduct research in computer vision, multimodal vision-language models (VLMs), and video understanding. I am advised by Derek Hoiem.

I completed my Ph.D. in Civil Engineering at UIUC, where I was advised by Mani Golparvar-Fard. Alongside my Ph.D. studies, I also earned an MCS in Computer Science at UIUC. Before the Ph.D., I completed an MSc in Civil Engineering at UIUC. I earned my bachelor's degree at the Pontificia Universidad Católica del Perú.

Email / CV / Google Scholar

Research

My research spans computer vision, multimodal learning, and continual learning, with emphasis on systems that operate over long temporal horizons. I have contributed to the development of next-generation vision-language models through:

Designing models for long-term spatiotemporal reasoning and open-vocabulary video understanding.
Developing object- and part-level representations for explainable visual reasoning in dynamic scenes.
Building continual learning frameworks that allow models to accumulate knowledge over time without retraining from scratch.
Advancing multimodal architectures capable of integrating vision, language, and long-video context.
Working on DARPA-sponsored research targeting robust perception and adaptive learning in real-world environments.

My interdisciplinary background in Civil Engineering and Computer Science informs my approach to modeling physical environments, structured geometry, and dynamic processes, enabling the creation of robust and interpretable vision systems.

Affiliations


University of Illinois, 2024-present
Reconstruct Inc., 2022-2024
Autodesk Research, 2020-2022
University of Illinois, 2015-2022
Pontificia Universidad Católica del Perú, 2003-2008

Publications

Region Representations Revisited
Michal Shlapentokh-Rothman, Ansel Blume, Yao Xiao, Yuqun Wu, Sethurame TV, Heyi Tao, Jae Yong Lee, Wilfredo Torres, Yu-Xiong Wang, Derek Hoiem
CVPR, 2024
Project Page / arXiv / Video
Region representations were popular in the pre-deep-learning era. What happens when we create region representations with recently released foundation models? We show that our simple method achieves impressive performance on existing tasks such as semantic segmentation, as well as on new ones.

Synthesizing pose sequences from 3D assets for video-based Activity Analysis
Wilfredo Torres Calderon, Dominic Roberts, Mani Golparvar-Fard
Journal of Computing in Civil Engineering, 2021
A vision-based activity analysis method that leverages hauling operations synthetically generated with 3D simulations.

Vision-based construction worker activity analysis informed by body posture
Dominic Roberts, Wilfredo Torres Calderon, Shuai Tang, Mani Golparvar-Fard
Journal of Computing in Civil Engineering, 2020
A vision-based activity analysis method that leverages the 2D pose estimation outputs used in many state-of-the-art construction worker ergonomics analysis methods.

An Annotation Tool for Benchmarking Methods for Automated Construction Worker Pose Estimation and Activity Analysis
Dominic Roberts, Mingzhu Wang, Wilfredo Torres Calderon, Mani Golparvar-Fard
International Conference on Smart Infrastructure and Construction (ICSIC), 2019
A 2D human pose annotation tool, adapted from CVAT, that can also annotate per-frame activity labels.

Automated Mining of Construction Schedules for Easy and Quick Assembly of 4D BIM Simulations
Wilfredo Torres Calderon, Yumo Chi, Fouad Amer, Mani Golparvar-Fard
International Conference on Computing in Civil Engineering (i3CE), 2019
An NLP model for automated mapping of raw construction activities to 3D BIM elements.

Teaching

CEE320 Construction Engineering and Management
Position: Graduate Teaching Assistant
Period: Fall 2016, Fall 2017, Spring 2018, and Fall 2018
Professor: Mani Golparvar-Fard

CEE598 Visual Sensing for Civil Infrastructure Engineering and Management
Position: Graduate Teaching Assistant
Period: Spring 2017
Professor: Mani Golparvar-Fard

CEE598 Building Information Modeling (BIM)
Position: Graduate Teaching Assistant
Period: Spring 2019
Professor: Mani Golparvar-Fard

CS543 Computer Vision
Position: Graduate Teaching Assistant
Period: Spring 2019
Professor: Svetlana Lazebnik

CS445 Computational Photography
Position: Graduate Teaching Assistant
Period: Fall 2019 and Spring 2020
Professor: Derek Hoiem

Credit to Jon Barron.