Publications
* equal contribution · † project lead
2026
RepWAM: World Action Modeling with Representation Visual-Action Tokenizers
Next Forcing: Causal World Modeling with Multi-Chunk Prediction
Causal World Modeling for Robot Control
2025
World-consistent Video Diffusion with Explicit 3D Modeling
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
2024
DSplats: 3D Generation by Denoising Splats-Based Multiview Diffusion Models
Urban Scene Diffusion through Semantic Occupancy Map
SceneWiz3D: Towards Text-guided 3D Scene Composition
BerfScene: BEV-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation
2023
Learning Modulated Transformation in GANs
Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding
Towards Smooth Video Composition
2022
Generative Category-Level Shape and Pose Estimation with Semantic Primitives
Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining
MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning
2021
F³A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks
Improving the Generalization of End-to-End Driving through Procedural Generation