About

I am a researcher at RobbyAnt, where I work on world models for robot control.

I received my Ph.D. from the Multimedia Lab (MMLab) at The Chinese University of Hong Kong in 2025, advised by Bolei Zhou and Dahua Lin. During my Ph.D., I was a visiting student at Stanford, advised by Gordon Wetzstein.

My research focuses on generative models, particularly in the 3D and video domains, and their use as world models for embodied agents. Along the way I spent time at Apple MLR, Snap Research, Shanghai AI Lab, and SenseTime Research.

Selected Publications

RepWAM: World Action Modeling with Representation Visual-Action Tokenizers
Junke Wang, Qihang Zhang, Shuai Yang, Yiming Luo, Yujun Shen, Zuxuan Wu, Yu-Gang Jiang, Yinghao Xu. Preprint.
Next Forcing: Causal World Modeling with Multi-Chunk Prediction
Gangwei Xu, Qihang Zhang†, Jiaming Zhou, Xing Zhu, Yujun Shen, Xin Yang, Yinghao Xu. Preprint.
Causal World Modeling for Robot Control
Lin Li*, Qihang Zhang*†, Yiming Luo*, Shuai Yang, Ruilin Wang, Luyao Zhang, Mingrui Yu, Zelin Gao, Nan Xue, Boyu Zhou, Xing Zhu, Mingyu Ding, Yujun Shen, Yinghao Xu. RSS 2026.
World-consistent Video Diffusion with Explicit 3D Modeling
Qihang Zhang, Shuangfei Zhai, Miguel Angel Bautista Martin, Kevin Miao, Alexander Toshev, Josh Susskind, Jiatao Gu. CVPR 2025, Highlight.

full list →