Hello world, I am Hao Tan (谭昊). I joined Adobe Research in August 2021. I was a Ph.D. student in the UNC CS department from 2016 to 2021, advised by Mohit Bansal. My Ph.D. study was supported by a Bloomberg Data Science Ph.D. Fellowship. Before joining UNC, I received a BS in CS from Shanghai Jiao Tong University, where I was a member of the ACM Honored Class.

My research goal is to build a virtual world. Specifically, I want to develop technology that can both generate a virtual world and digitize the real world. I see this as the only way to get close to XR (cross reality).

I previously worked on static 3D generation and feed-forward reconstruction. I am now working on modeling dynamic and complex worlds. It is (maybe) obvious that the existing 3D pipeline cannot support this goal in a scalable way, so I am considering rebuilding the pipeline. I have been betting on advances in long-context technology and am actively working on it. We realized that the spirit of 3D is to compress, and that the compression problem is better formulated as a long-context problem.

https://www.cs.unc.edu/~airsplay/

Publications

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

Ziwen, Chen., Tan, Hao., Zhang, Kai., Bi, Sai., Luan, Fujun., Hong, Yicong., Li, Fuxin., Xu, Zexiang. (Oct. 19, 2025)

ICCV 2025

VEGGIE: Instructional Editing and Reasoning of Video Concepts with Grounded Generation

Yu, Shoubin., Liu, Difan., Ma, Ziqiao., Hong, Yicong., Zhou, Yang., Tan, Hao., Chai, Joyce., Bansal, Mohit. (Oct. 19, 2025)

ICCV 2025

Progressive Autoregressive Video Diffusion Models

Xie, Desai., Xu, Zhan., Hong, Yicong., Tan, Hao., Liu, Difan., Liu, Feng., Kaufman, Arie., Zhou, Yang. (Jun. 13, 2025)

Conference on Computer Vision and Pattern Recognition (CVPR 2025) - CVEU workshop

Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models

Van Nguyen, Minh., Dernoncourt, Franck., Yoon, David., Deilamsalehy, Hanieh., Tan, Hao., Rossi, Ryan., Tran, Quan., Bui, Trung., Nguyen, Thien. (Sep. 5, 2024)

Interspeech 2024

Building Vision-Language Models on Solid Foundations with Masked Distillation

Sameni, Sepehr., Kafle, Kushal., Tan, Hao., Jenni, Simon. (Jun. 17, 2024)

CVPR 2024

LRM: Large Reconstruction Model for Single Image to 3D

Hong, Yicong., Zhang, Kai., Gu, Jiuxiang., Bi, Sai., Zhou, Yang., Liu, Difan., Liu, Feng., Sunkavalli, Kalyan., Bui, Trung., Tan, Hao. (May. 7, 2024)

ICLR 2024

Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model

Li, Jiahao., Tan, Hao., Zhang, Kai., Xu, Zexiang., Luan, Fujun., Xu, Yinghao., Hong, Yicong., Sunkavalli, Kalyan., Shakhnarovich, Greg., Bi, Sai. (May. 7, 2024)

ICLR 2024

DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents

Liu, Fuxiao., Tan, Hao., Tensmeyer, Chris. (Apr. 26, 2024)

ArXiv

Learning Navigational Visual Representations with Semantic Map Supervision

Hong, Yicong., Zhou, Yang., Zhang, Ruiyi., Dernoncourt, Franck., Bui, Trung., Gould, Stephen., Tan, Hao. (Oct. 6, 2023)

ICCV 2023

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning

Lai, Viet., Salinas, Abel., Tan, Hao., Bui, Trung., Tran, Quan., Yoon, David., Deilamsalehy, Hanieh., Dernoncourt, Franck., Nguyen, Thien. (Aug. 24, 2023)

Interspeech 2023

News