Nupur Kumari
I am a final-year PhD student at the Robotics Institute, Carnegie Mellon University (CMU). I am advised by Jun-Yan Zhu and collaborate closely with
Richard Zhang and Eli Shechtman. My research interests lie in computer vision, specifically generative models, model customization, and post-training techniques.
Prior to CMU, I worked at Media and Data Science Research, Adobe India, where I had the pleasure of collaborating
with Vineeth N Balasubramanian. I did my undergraduate studies at the Indian Institute of Technology Delhi with a major in Mathematics and Computing.
Email / LinkedIn / Resume / Google Scholar
Selected Publications
NP-Edit: Learning an Image Editing Model without Image Editing Pairs
Nupur Kumari, Sheng-Yu Wang, Nanxuan Zhao, Yotam Nitzan, Yuheng Li, Krishna Kumar Singh, Richard Zhang, Eli Shechtman, Jun-Yan Zhu, Xun Huang
We propose NP-Edit (No-Pair Edit), a framework for training image editing models using gradient feedback from a Vision-Language Model (VLM), requiring no paired supervision. Our formulation combines VLM feedback with a distribution matching loss to learn a few-step image editing model. Performance improves directly with more powerful VLMs and larger datasets, demonstrating the scalability of the approach.
ICLR 2026.
[Paper]
[Webpage]
Generating Multi-Image Synthetic Data for Text-to-Image Customization
Nupur Kumari, Xi Yin, Jun-Yan Zhu, Ishan Misra, Samaneh Azadi
We propose a data generation pipeline for image customization that produces multiple images of the same object in different contexts. Given this training data, we train a new encoder-based model for the task, which can successfully generate new compositions of a reference object using text prompts.
ICCV 2025.
[Paper]
[Webpage]
[Code]
Generative Photomontage
Sean J. Liu, Nupur Kumari, Ariel Shamir, Jun-Yan Zhu
We propose a framework for creating a desired image by compositing parts of multiple generated images, in essence forming a Generative Photomontage.
CVPR 2025.
[Paper]
[Webpage]
[Code]
Customizing Text-to-Image Diffusion with Object Viewpoint Control
Nupur Kumari*, Grace Su*, Richard Zhang, Taesung Park, Eli Shechtman, Jun-Yan Zhu
We propose Custom Diffusion-360, a method that adds object viewpoint control when personalizing text-to-image diffusion models, e.g., Stable Diffusion-XL, given multi-view images of the new object.
SIGGRAPH Asia 2024.
[Paper]
[Webpage]
[Code]
Customizing Text-to-Image Models with a Single Image Pair
Maxwell Jones, Sheng-Yu Wang, Nupur Kumari, David Bau, Jun-Yan Zhu
We propose PairCustomization, a method to learn new style concepts from a single image pair by decomposing style and content.
SIGGRAPH Asia 2024.
[Paper]
[Webpage]
[Code]
Multi-Concept Customization of Text-to-Image Diffusion
Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu
We propose Custom Diffusion, a method to fine-tune large-scale text-to-image diffusion models, e.g., Stable Diffusion, given a few
(~4-20) user-provided images of a new concept.
Our method is computationally efficient (~6 minutes on 2 A100 GPUs) and requires only 75 MB of additional storage for
each concept model on top of the pretrained model.
CVPR 2023.
[Paper]
[Webpage]
[Code]
Ablating Concepts in Text-to-Image Diffusion Models
Nupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, Jun-Yan Zhu
We propose a method to ablate (remove) copyrighted materials and memorized images from pretrained
text-to-image generative models. Our algorithm changes the target concept distribution to that of an anchor
concept, e.g., Van Gogh paintings to generic paintings, or Grumpy Cat to cat.
ICCV 2023.
[Paper]
[Webpage]
[Code]
Content-Based Search for Deep Generative Models
Daohan Lu*, Sheng-Yu Wang*, Nupur Kumari*, Rohan Agarwal*, Mia Tang, David Bau, Jun-Yan Zhu
We propose an algorithm for searching over generative models using image, text, and sketch queries.
Our search platform is available at Modelverse.
SIGGRAPH Asia 2023.
[Paper]
[Webpage]
[Code]
Ensembling Off-the-shelf Models for GAN Training
Nupur Kumari, Richard Zhang, Eli Shechtman, Jun-Yan Zhu
We show that pretrained computer vision models can significantly improve GAN performance when used in an
ensemble of discriminators.
Our method improves FID by 1.5x to 2x on the cat, church, and horse categories of LSUN.
CVPR 2022 (Oral).
[Paper]
[Webpage]
[Code]
Attributional Robustness Training using Input-Gradient Spatial Alignment
Nupur Kumari*, Mayank Singh*, Puneet Mangla, Abhishek Sinha, Vineeth N Balasubramanian, Balaji Krishnamurthy
We propose ART, a robust attribution training methodology that maximizes the alignment between
the input and its attribution map.
ART achieves state-of-the-art performance in attributional robustness and weakly supervised
object localization on the CUB dataset.
ECCV 2020.
[Paper]
[Webpage]
[Code]
Charting the Right Manifold: Manifold Mixup for Few-shot Learning
Puneet Mangla*, Nupur Kumari*, Abhishek Sinha*, Mayank Singh*, Vineeth N Balasubramanian, Balaji Krishnamurthy
We use self-supervision techniques, rotation and exemplar, followed by manifold mixup for few-shot
classification tasks.
The proposed approach beats the state-of-the-art accuracy on the mini-ImageNet, CUB, and CIFAR-FS
datasets by 3-8%.
WACV 2020.
[Paper]
[Code]
* denotes equal contribution
