|
Sucheng Ren
Hi, I am Sucheng Ren, a Computer Science Ph.D. student at Johns Hopkins University, where I am fortunate to be advised by Prof. Alan Yuille and Prof. Cihang Xie. I received my B.S. and M.S. degrees in Computer Science from South China University of Technology, advised by Prof. Shengfeng He. Currently, I am a research intern at Apple. Previously, I spent time at ByteDance Seed, Microsoft Research Asia (MSRA), Tsinghua University, and the National University of Singapore.
My research focuses on diffusion- and autoregressive-based generative models and multimodal learning.
Email  | 
CV  | 
Scholar  | 
Github  | 
|
|
- [Feb. 2026] FreqFlow was accepted to CVPR 2026, and M-VAR was accepted to CVPR 2026 Findings!
- [Jun. 2025] xAR was accepted to ICCV 2025!
- [May. 2025] FlowAR was accepted to ICML 2025!
- [Jan. 2025] ARM was accepted to ICLR 2025!
- [Jan. 2025] ARVideo was accepted to TMLR!
- [May. 2024] D-iGPT was accepted to ICML 2024 as an oral presentation!
- [Aug. 2023] Joined Johns Hopkins University as a Ph.D. student!
- [Jul. 2023] SG-Former was accepted to ICCV 2023!
- [Feb. 2023] TinyMIM was accepted to CVPR 2023!
|
|
M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation
Sucheng Ren,
Yaodong Yu,
Nataniel Ruiz,
Feng Wang,
Alan Yuille,
Cihang Xie
IEEE Conference on Computer Vision and Pattern Recognition (CVPR Findings), 2026
[paper]
[code]
[bibtex]
We decouple scale-wise attention, which allows VAR to be rebuilt in a more computationally efficient manner.
|
|
|
Autoregressive Pretraining with Mamba in Vision
Sucheng Ren,
Xianhang Li,
Haoqin Tu,
Feng Wang,
Fangxun Shu,
Lei Zhang,
Jieru Mei,
Linjie Yang,
Peng Wang,
Heng Wang,
Alan Yuille,
Cihang Xie
International Conference on Learning Representations (ICLR), 2025
[paper]
[code]
[bibtex]
We are the first to pretrain Mamba in vision with cluster-based autoregressive modeling.
|
|
|
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
Sucheng Ren,
Fangyun Wei,
Zheng Zhang,
Han Hu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[paper]
[code]
[bibtex]
We explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones.
|
|
|
Shunted Self-Attention via Multi-Scale Token Aggregation
Sucheng Ren,
Daquan Zhou,
Shengfeng He,
Jiashi Feng,
Xinchao Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)
[paper]
[code]
[bibtex]
Integrating the capability of capturing multiscale objects in each attention layer by adaptively merging tokens.
|
|
|
SG-Former: Self-guided Transformer with Evolving Token Reallocation
Sucheng Ren,
Xingyi Yang,
Songhua Liu,
Xinchao Wang
International Conference on Computer Vision (ICCV), 2023
[paper]
[code]
[bibtex]
Integrating the capability of capturing multiscale objects in each attention layer by adaptively merging tokens.
|
|
|
Learning from the Master: Distilling Cross-modal Advanced Knowledge for Lip Reading
Sucheng Ren,
Yong Du,
Jianming Lv,
Guoqiang Han,
and Shengfeng He
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[paper]
[bibtex]
Training a master to learn how to teach a better student.
|
|
|
Reciprocal Transformations for Unsupervised Video Object Segmentation
Sucheng Ren,
Wenxi Liu,
Yongtuo Liu,
Haoxin Chen,
Guoqiang Han and
Shengfeng He
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[paper]
[bibtex]
[code]
Jointly learning salient, moving, and recurring objects for unsupervised video object segmentation.
|
|
|
TENet: Triple Excitation Network for Video Salient Object Detection
Sucheng Ren,
Chu Han,
Xin Yang,
Guoqiang Han and
Shengfeng He
European Conference on Computer Vision (ECCV), 2020
(Spotlight, Acceptance Rate 5.0%)
[paper]
[bibtex]
|
|