Irving Fang

I am a Computer Science PhD student in the AI4CE Lab at NYU, led by Prof. Chen Feng.

I obtained my bachelor's degree from UC Berkeley, double majoring in Data Science (Robotics Emphasis) and Mathematics, with minors in Japanese Literature and EECS.
At UC Berkeley, I was fortunate to work with Prof. Alice Agogino in her BEST Lab and at Squishy Robotics.

Currently, I am interning at Analog Devices (ADI). During Summer 2022, I interned at Mitsubishi Electric Research Laboratories (MERL), working with Dr. Radu Corcodel on tactile sensing and deep reinforcement learning.

In my free time I enjoy (unnecessarily) ricing my Linux system and playing with all kinds of MCU/FPGA boards. I am also a fan of clothing/jewelry design and video games.

Email  /  CV  /  Google Scholar  /  Github

profile photo
Research

At the broadest level, my research lies at the intersection of robotics, computer vision, and machine learning.

I am particularly interested in contact-rich manipulation: can we make robots as dexterous, adaptive, and efficient as humans when interacting with objects, the environment, or even other people through contact?

I would like to approach this problem with a diverse toolbox, including deep learning, tactile sensing, model predictive control, vision-language models, hardware design, simulation, and even emerging approaches like neuromorphic computing.

In my free time, I also contribute my computational skills to scientific research in other fields such as anthropology.

For collaboration, click here.

From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models
Irving Fang*, Juexiao Zhang*, Shengbang Tong*, Chen Feng (* for equal contribution)
arXiv (under review)
project page / arXiv / code

Can your VLA generalize like a VLM? INT-ACT may give you some clues.

FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction
Irving Fang*, Kairui Shi*, Xujin He*, Siqi Tan, Yifan Wang, Hanwen Zhao, Hung-Jui Huang, Wenzhen Yuan, Chen Feng, Jing Zhang (* for equal contribution)
ICRA 2025
project page / arXiv / code

Robot reconstructing visually and geometrically accurate surroundings with sparse visual and tactile data

VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model
Juexiao Zhang*, Beicheng Wang*, Shuwen Dong†, Irving Fang†, Chen Feng (* and † denote equal contribution)
arXiv
project page / arXiv / code

Let the robot follow a human's actions by watching just one video.

EgoPAT3Dv2: Predicting 3D Action Target from 2D Egocentric Vision for Human-Robot Interaction
Irving Fang*, Yuzhong Chen*, Yifan Wang*, Jianghan Zhang†, Qiushi Zhang†, Jiali Xu†, Xibo He, Weibo Gao, Hao Su, Yiming Li, Chen Feng (* and † denote equal contribution)
ICRA 2024
project page / arXiv / code

Human-robot interaction for a potentially AR world?

DeepExplorer: Metric-Free Exploration for Topological Mapping by Task and Motion Imitation in Feature Space
Yuhang He*, Irving Fang*, Yiming Li, Rushi Bhavesh Shah, Chen Feng (* for equal contribution)
RSS 2023
project page / arXiv / code

A simple and effective framework for efficient and lightweight active visual exploration with only RGB images as input

Dynamic Placement of Rapidly Deployable Mobile Sensor Robots Using Machine Learning and Expected Value of Information
Alice Agogino, Hae Young Jang, Vivek Rao, Ritik Batra, Felicity Liao, Rohan Sood, Irving Fang, R Lily Hu, Emerson Shoichet-Bartus, John Matranga (Authors ordered by department affiliation, not contribution)
ASME IMECE, 2021
arXiv / code

A framework for optimizing the deployment of emergency sensor robots using a Long Short-Term Memory (LSTM) neural network and the Expected Value of Information (EVI)

Personal Projects

Please visit this repo. It contains pointers to personal projects ranging from robotics to a RISC-V CPU implemented on a Xilinx FPGA board.

Teaching

Teaching Aide, ROB-UY 3203 Robot Vision, Spring 2023
Teaching Aide, ROB-GY 6203 Robot Perception, Fall 2022
Teaching Aide, ROB-UY 3203 Robot Vision, Spring 2022

Service

Reviewer, ICRA 2024, ICRA 2025
Reviewer, IROS 2025
Reviewer, DARS 2024

Personal

My wife is an NYU Philosophy PhD turned cloud-computing engineer. In Japanese, there is a phrase, "雲の上人" (kumo no uebito), which literally means "a person living above the clouds." It describes someone extraordinarily talented and a bit otherworldly, almost like a dream. To me, she embodies that phrase more than anyone.


This website is based on Dr. Jon Barron's source code.