About
I am a researcher at RobbyAnt, where I work on world models for robot control.
I received my Ph.D. in 2025 from the Multimedia Lab (MMLab) at The Chinese University of Hong Kong, where I was advised by Bolei Zhou and Dahua Lin. During my Ph.D., I was a visiting student at Stanford, advised by Gordon Wetzstein.
My research focuses on generative models, particularly in the 3D and video domains, and their use as world models for embodied agents. Along the way, I spent time at Apple MLR, Snap Research, Shanghai AI Lab, and SenseTime Research.
Selected Publications
RepWAM: World Action Modeling with Representation Visual-Action Tokenizers
Next Forcing: Causal World Modeling with Multi-Chunk Prediction
Causal World Modeling for Robot Control
World-consistent Video Diffusion with Explicit 3D Modeling