I am a Research Scientist at ByteDance working on GenAI. Previously, I earned my Ph.D. in Computer Science from Johns Hopkins University, advised by Bloomberg Distinguished Professor Alan L. Yuille. I have broad research experience across various areas of computer vision and artificial intelligence, including but not limited to Video Generation 1 2 3, 3D Vision 4 5 6, Robust Vision 7 8, Differentiable Rendering 9, and Medical Image Diagnosis 10 11.
I am currently focused on the large-scale post-training and fine-tuning of next-generation video models, including Seedance 2.0/1.0 and Wan 2.1/2.2. My work bridges the gap between foundational research and scalable implementation through four core pillars:
Controllable Synthesis: Developing high-quality, temporally consistent video generation models with a focus on precise user control.
World Modeling via Long Video Generation: Exploring agentic storytelling and continuous long-video generation to move toward robust, large-scale world modeling.
Architectural Optimization: Balancing the trade-offs between computational efficiency and generative quality by advancing foundational model architectures.
Autonomous Data Ecosystems: Scaling training datasets and labeling systems by leveraging agentic AI to automate and optimize the end-to-end data processing pipeline.