Animesh Karnewar

I am currently a Senior Research Scientist at Qualcomm AI Research working on Efficient and Controllable Video Generative AI.

I completed my PhD at UCL, London, during which I was also a Visiting Researcher at Meta GenAI research, London. I was a recipient of the prestigious Marie Curie Fellowship for my PhD, as part of the PRIME-ITN. My research, titled "Towards Computationally Efficient, Photo-realistic, Large-scale, 3D Generative Modelling", was supervised by Prof. Niloy J. Mitra and Prof. Tobias Ritschel.

I collaborated with Oliver Wang (now Google DeepMind, prev. Adobe Research) during my PhD, and with David Novotny (now SpAItialAI, prev. Meta GenAI London) and Prof. Andrea Vedaldi (University of Oxford) during a Research Scientist Internship and as a Visiting Researcher at Meta GenAI, London.

Prior to this, I worked as an R&D Engineer at TomTom, Amsterdam. I completed my Bachelor's in Computer Science at PICT Pune as a university topper with a 4.0/4.0 GPA.

Apart from this, I am a hobbyist photographer and videographer. Instagram: @akanimax3.

Google Scholar / Twitter / Github / YouTube / Medium / Letterboxd

Awards and Acknowledgements

Rabin Ezra Scholarship, 2023
Marie Curie Fellowship, 2021
"The Hindu" newspaper's article covering my research, 2018 [local-copy in case of paywall]
Graduated as Department Topper (Computer Science), 2017
University Topper award, 2015

	Neodragon: Mobile Video Generation using Diffusion Transformer Animesh Karnewar, Denis Korzhenkov, Ioannis Lelekas, Noor Fathima, Adil Karjauv, Hanwen Xiong, Vancheeswaran Vaidyanathan, Will Zeng, Rafael Esteves, Tushar Singhal, Fatih Porikli, Mohsen Ghafoorian, Amirhossein Habibian WhitePaper, 2025 project page / arXiv We introduce Neodragon, a text-to-video system that generates 2s videos (49 frames @ 24 fps) at [640x1024] resolution directly on a Qualcomm Hexagon NPU in a record 6.7s (7 FPS), with a VBench score of 81.61.
	Meta 3D Gen Raphael Bensadoun, Tom Monnier, Yanir Kleiman, Filippos Kokkinos, Yawar Siddiqui, Mahendra Kariya, Omri Harosh, Roman Shapovalov, Benjamin Graham, Emilien Garreau, Animesh Karnewar, Ang Cao, Idan Azuri, Iurii Makarov, Eric-Tuan Le, Antoine Toisoul, David Novotny, Oran Gafni, Natalia Neverova, Andrea Vedaldi WhitePaper, 2024 project page / arXiv We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute.
	GOEmbed: Gradient Origin Embeddings for representation agnostic 3D Feature Learning Animesh Karnewar, Roman Shapovalov, Tom Monnier, Andrea Vedaldi, Niloy J. Mitra, David Novotny ECCV, 2024 project page / arXiv We propose GOEmbed (Gradient Origin Embeddings), which encodes source views (observations) into arbitrary 3D Radiance-Field representations while trying to maximize the transfer of source information.
	HoloFusion: Towards Photo-realistic 3D Generative Modeling Animesh Karnewar, Niloy J. Mitra, Andrea Vedaldi, David Novotny ICCV, 2023 project page / video / arXiv We propose HoloFusion to generate photo-realistic 3D radiance fields by extending the HoloDiffusion method with a jointly trained 2D 'super resolution' network.
	HoloDiffusion: Training a 3D Diffusion Model using 2D Images Animesh Karnewar, Andrea Vedaldi, David Novotny, Niloy J. Mitra CVPR, 2023 project page / video / arXiv / code We present HoloDiffusion as the first 3D-aware generative diffusion model that produces 3D-consistent images while being trained with only posed image supervision.
	3inGAN: Learning a 3D Generative Model from Images of a Self-similar Scene Animesh Karnewar, Oliver Wang, Tobias Ritschel, Niloy J. Mitra 3DV, 2022 project page / video / arXiv / code We introduce 3inGAN, an unconditional 3D generative model trained from 2D images of a single self-similar 3D scene.
	ReLU Fields: The Little Non-linearity That Could Animesh Karnewar, Tobias Ritschel, Oliver Wang, Niloy J. Mitra SIGGRAPH, 2022 project page / video / arXiv / code We present a method to represent complex signals such as images or 3D scenes on regularly sampled grid vertices. Our method is able to match the expressiveness of coordinate-based MLPs while retaining reconstruction and rendering speed of voxel grids, without requiring any neural networks or sparse data structures. As a result it converges significantly faster.
	RGBD-Net: Predicting Color and Depth images for Novel Views Synthesis Phong Nguyen, Animesh Karnewar, Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkila 3DV, 2021 video / arXiv / code We propose a new cascaded architecture for novel view synthesis, called RGBD-Net, which consists of two core components: a hierarchical depth regression network and a depth-aware generator network. The former predicts depth maps of the target views by using adaptive depth scaling, while the latter leverages the predicted depths and renders spatially and temporally consistent target images.
	MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks Animesh Karnewar, Oliver Wang CVPR, 2020 video / arXiv / code We propose the MSG-GAN (Multi-Scale Gradient Generative Adversarial Network), a simple but effective technique for addressing the instability problem of GANs by allowing the flow of gradients from the discriminator to the generator at multiple scales. This technique provides a stable approach for high resolution image synthesis, and serves as an alternative to the commonly used progressive growing technique.

Academic service

Reviewer at ICLR 2025
Reviewer at ICCV/ECCV 2021, 2022, 2023, 2024, 2025
Reviewer at CVPR 2021, 2022, 2023, 2024, 2025
Reviewer at SIGGRAPH 2024
Reviewer at Pacific Graphics 2024
Reviewer at AAAI 2024

Teaching and Research Talks

TA for MLVC course UCL 2022
TA for MLVC course UCL 2021
[Talk] TomTom AI summer school, Amsterdam, 2019
[Talk] MIT, Pune, 2019
[Talk] Humans of Analytics Fireside-chat, 2018 [YouTube video]

This webpage is created using Jon Barron's template.