|
Research
My research spans computer vision, multimodal learning, and continual learning, with emphasis on systems that operate
over long temporal horizons. I have contributed to the development of next-generation vision-language models through:
- Designing models for long-term spatiotemporal reasoning and open-vocabulary video understanding.
- Developing object- and part-level representations for explainable visual reasoning in dynamic scenes.
- Building continual learning frameworks that allow models to accumulate knowledge over time without retraining from scratch.
- Advancing multimodal architectures capable of integrating vision, language, and long-video context.
- Working on DARPA-sponsored research targeting robust perception and adaptive learning in real-world environments.
My interdisciplinary background in Civil Engineering and Computer Science informs my approach to modeling physical environments,
structured geometry, and dynamic processes, enabling the creation of robust and interpretable vision systems.
|
|
University of Illinois 2024-present |
Reconstruct Inc. 2022-2024 |
Autodesk Research 2020-2022 |
University of Illinois 2015-2022 |
Pontificia Universidad Catolica del Peru 2003-2008 |
|
|
Region Representations Revisited
Michal Shlapentokh-Rothman*, Ansel Blume*, Yao Xiao, Yuqun Wu, Sethurame TV, Heyi Tao, Jae Yong Lee, Wilfredo Torres, Yu-Xiong Wang, Derek Hoiem
CVPR, 2024
Project Page / arXiv / Video
Region representations were popular in the pre-deep-learning era. What happens when we build region representations from recently released foundation models? We show that our simple method achieves strong performance on existing tasks such as semantic segmentation, as well as on new ones.
|
|
|
Synthesizing pose sequences from 3D assets for video-based Activity Analysis
Wilfredo Torres Calderon, Dominic Roberts, Mani Golparvar-Fard
Journal of Computing in Civil Engineering, 2021
A vision-based activity analysis method that leverages hauling operations synthetically generated with 3D simulations.
|
|
|
Vision-based construction worker activity analysis informed by body posture
Dominic Roberts, Wilfredo Torres Calderon, Shuai Tang, Mani Golparvar-Fard
Journal of Computing in Civil Engineering, 2020
A vision-based activity analysis method that leverages 2D pose estimation outputs, which are also used in many state-of-the-art construction worker ergonomics analysis methods.
|
|
|
An Annotation Tool for Benchmarking Methods for Automated Construction Worker Pose Estimation and Activity Analysis
Dominic Roberts, Mingzhu Wang, Wilfredo Torres Calderon, Mani Golparvar-Fard
International Conference on Smart Infrastructure and Construction (ICSIC), 2019
A 2D human pose annotation tool adapted from CVAT that can also annotate per-frame activity labels.
|
|
|
Automated Mining of Construction Schedules for Easy and Quick Assembly of 4D BIM Simulations
Wilfredo Torres Calderon, Yumo Chi, Fouad Amer, Mani Golparvar-Fard
International Conference on Computing in Civil Engineering (i3CE), 2019
An NLP model for automated mapping of raw construction activities to 3D BIM elements.
|
CEE320 Construction Engineering and Management,
Position: Graduate Teaching Assistant,
Period: Fall 2016, Fall 2017, Spring 2018 & Fall 2018.
Professor: Mani Golparvar-Fard
CEE598 Visual Sensing for Civil Infrastructure Engineering and Management,
Position: Graduate Teaching Assistant,
Period: Spring 2017.
Professor: Mani Golparvar-Fard
CEE598 Building Information Modeling (BIM),
Position: Graduate Teaching Assistant,
Period: Spring 2019.
Professor: Mani Golparvar-Fard
CS543 Computer Vision,
Position: Graduate Teaching Assistant,
Period: Spring 2019.
Professor: Svetlana Lazebnik
CS445 Computational Photography,
Position: Graduate Teaching Assistant,
Period: Fall 2019 & Spring 2020.
Professor: Derek Hoiem
|
|