Utkarsh Singhal

I recently joined Tesla Optimus to build world models for robots. Before this, I finished my PhD at the Berkeley AI Research Lab, where I was advised by Prof. Stella Yu.

Email | GitHub | Google Scholar | Resume

Research

I'm interested in building adaptable and reliable embodied AI. My research interests span test-time optimization, world models, and robotics.

	Test-Time Canonicalization by Foundation Models for Robust Perception Utkarsh Singhal^, Ryan Feng^, Stella Yu, Atul Prakash ICML, 2025 paper | code | website We use test-time search to make models approximately invariant to many different transformations without any special training or architectures.
	How to guess a gradient Utkarsh Singhal, Brian Cheung, Kartik Chandra, Jonathan Ragan-Kelley, Joshua Tenenbaum, Tomaso Poggio, Stella Yu Optimization for Machine Learning Workshop (OPT2023), NeurIPS, 2023 paper We use architecture and activations to guess a neural network’s gradients without computing the loss or using backprop.
	Learning to Transform for Generalizable Instance-wise Invariance Utkarsh Singhal, Carlos Esteves, Ameesh Makadia, Stella Yu International Conference on Computer Vision (ICCV), 2023 paper | code | website We predict a distribution of transformations for any input image. This can be used for data augmentation, aligning instances, and adapting to out-of-distribution poses.
	Multi-Spectral Image Classification with Ultra-Lean Complex-Valued Models Utkarsh Singhal, Stella X Yu, Zackery Steck, Scott Kangas, Aaron A Reite (Oral) Humanitarian Aid and Disaster Workshop (HADR), NeurIPS, 2022 paper We apply our co-domain symmetry (CDS) work to a small, imbalanced dataset in the multi-spectral image (MSI) classification setting.
	Co-domain symmetry for complex-valued deep learning Utkarsh Singhal, Yifei Xing, Stella Yu Computer Vision and Pattern Recognition (CVPR), 2022 paper | code We make complex-valued CNNs that are invariant to scale and phase shifts of the input pixels, and apply them to SAR image classification.
	Fourier features let networks learn high frequency functions in low dimensional domains Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, Ren Ng (Spotlight) NeurIPS, 2021 paper | code | website We explain why neural networks struggle to learn high-frequency functions in low-dimensional domains and how positional encoding (Fourier features) helps.
	LO represents motion and semantic categories in addition to object boundaries Utkarsh Singhal, Jack Gallant, Mark Lescroart Journal of Vision, 2019 paper We studied how the Lateral Occipital cortex represents object boundaries using fMRI.

Source code from Leonid Keselman's fork of Jon Barron's website