During my PhD, I collaborated with Oliver Wang from Adobe Research, and with David Novotny and Prof. Andrea Vedaldi
from Meta GenAI London during a research scientist internship and as a Visiting Researcher.
Prior to this, I worked as an R&D Engineer at TomTom, Amsterdam.
I completed my Bachelor's in Computer Science at PICT Pune, graduating as the university topper with
a 4.0/4.0 GPA.
Apart from this, I am a movie lover, an exploratory casual gamer, and a hobbyist photographer.
Feel free to check out my photography work on Instagram: @akanimax3.
I am currently interested in applying generative modelling, in a scalable and efficient manner,
to large-scale static 3D assets, geared specifically towards creative 3D applications such as
movies and video games. Other interests include physically based rendering, differentiable
rendering, and efficient 3D representations.
Meta 3D Gen
Raphael Bensadoun, Tom Monnier, Yanir Kleiman, Filippos Kokkinos, Yawar Siddiqui, Mahendra Kariya, Omri Harosh, Roman Shapovalov, Benjamin Graham, Emilien Garreau, Animesh Karnewar, Ang Cao, Idan Azuri, Iurii Makarov, Eric-Tuan Le, Antoine Toisoul, David Novotny, Oran Gafni, Natalia Neverova, Andrea Vedaldi
White Paper, 2024
project page / arXiv
We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset
generation. 3DGen offers 3D asset creation with high prompt fidelity and
high-quality 3D shapes and textures in under a minute.
We propose GOEmbed (Gradient Origin Embeddings), which encodes
source views (observations) into arbitrary 3D radiance-field
representations while maximizing the transfer of information from the source views.
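To illustrate the gradient-origin idea, here is a minimal PyTorch sketch (not the paper's exact recipe): the 3D representation is initialised at the origin, rendered to the source views through an assumed differentiable renderer `render`, and the gradient of the resulting photometric loss with respect to that origin is taken as the encoding. The grid shape is an illustrative assumption.

```python
import torch

def go_embed(source_views, cameras, render, grid_shape=(32, 32, 32, 4)):
    """Minimal sketch of a gradient-origin encoding (illustrative assumptions).

    `render(grid, cameras)` is an assumed differentiable renderer that maps the
    3D representation to images from the given cameras.
    """
    origin = torch.zeros(grid_shape, requires_grad=True)           # representation initialised at the origin
    rendered = render(origin, cameras)                             # project the origin to the source views
    loss = torch.nn.functional.mse_loss(rendered, source_views)    # photometric reconstruction error
    (grad,) = torch.autograd.grad(loss, origin)                    # dL/d(origin)
    return -grad                                                   # encoding = negative gradient at the origin
```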
We propose HoloFusion to generate photo-realistic 3D radiance fields by extending the HoloDiffusion
method with a jointly trained 2D super-resolution network.
We present HoloDiffusion, the first 3D-aware generative diffusion model that produces
3D-consistent images while being trained with only posed image supervision.
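As a rough illustration of training a 3D diffusion model with only posed-image supervision, the heavily simplified sketch below noises a 3D feature grid, denoises it with a 3D network, and supervises the result purely through images rendered at known camera poses. `denoiser_3d`, `render`, the linear noising schedule, and how `feature_grid` is obtained are all placeholder assumptions, not the paper's exact procedure.

```python
import torch

def posed_image_diffusion_step(denoiser_3d, render, feature_grid, images, cameras, optimizer):
    """Highly simplified sketch: diffuse a 3D feature grid, denoise it with a 3D
    network, and supervise only through rendered posed images (all placeholders)."""
    t = torch.rand(())                                      # random diffusion time in [0, 1]
    noise = torch.randn_like(feature_grid)
    noisy_grid = (1 - t) * feature_grid + t * noise         # simple linear noising schedule (assumption)

    denoised = denoiser_3d(noisy_grid, t)                   # predict the clean 3D grid
    rendered = render(denoised, cameras)                    # render to the known camera poses
    loss = torch.nn.functional.mse_loss(rendered, images)   # posed-image (photometric) supervision only

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```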
We present a method to represent complex signals, such as images or 3D scenes, on
regularly sampled grid vertices. Our method matches the expressiveness of coordinate-based
MLPs while retaining the reconstruction and rendering speed of voxel grids, without requiring any neural
networks or sparse data structures. As a result, it converges significantly faster.
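A minimal 2D sketch of the grid-vertex idea is given below, assuming a learnable grid of values, bilinear interpolation, and a fixed clamp non-linearity; the resolution and the choice of non-linearity are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

class GridField2D(torch.nn.Module):
    """Learnable values live on a regular 2D grid; queries are answered by
    bilinear interpolation followed by a fixed pointwise non-linearity
    (here a clamp), with no neural network involved."""
    def __init__(self, resolution=256, channels=3):
        super().__init__()
        self.grid = torch.nn.Parameter(torch.zeros(1, channels, resolution, resolution))

    def forward(self, coords):
        # coords: (N, 2) in [-1, 1]; grid_sample expects a (1, H_out, W_out, 2) query grid
        samples = F.grid_sample(
            self.grid, coords.reshape(1, 1, -1, 2), align_corners=True
        )                                                    # (1, C, 1, N)
        values = samples.reshape(self.grid.shape[1], -1).t() # (N, C)
        return values.clamp(0.0, 1.0)                        # fixed non-linearity instead of an MLP
```

Fitting such a field is plain regression: sample coordinates, compare the returned values against ground-truth colours with an MSE loss, and optimise the grid directly; since each query touches only a few grid vertices, optimisation converges quickly.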
We propose a new cascaded architecture for novel view synthesis, called RGBD-Net,
which consists of two core components: a hierarchical depth regression network and a depth-aware
generator network. The former predicts depth maps of the target views using adaptive depth
scaling, while the latter leverages the predicted depths to render spatially and temporally
consistent target images.
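The cascade can be pictured with the following hedged sketch, where `depth_net` and `generator` are placeholder modules standing in for the two components described above rather than the paper's exact architectures.

```python
import torch

def cascaded_view_synthesis(depth_net, generator, source_images, source_cameras, target_camera):
    """Sketch of a two-stage cascade: regress target-view depth first, then
    condition a depth-aware generator on that depth to synthesize the image."""
    # Stage 1: depth regression for the target viewpoint.
    target_depth = depth_net(source_images, source_cameras, target_camera)

    # Stage 2: depth-aware generation of the target image.
    target_image = generator(source_images, source_cameras, target_camera, target_depth)
    return target_image, target_depth
```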
We propose MSG-GAN (Multi-Scale Gradient Generative Adversarial Network), a simple but
effective technique for addressing GAN training instability by allowing gradients to flow
from the discriminator to the generator at multiple scales. This technique
provides a stable approach to high-resolution image synthesis, and serves as an alternative
to the commonly used progressive growing technique.
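A toy PyTorch sketch of the multi-scale-gradient idea is shown below; layer sizes and the 32x32 top resolution are illustrative assumptions. The generator emits an RGB image at every intermediate resolution, and the discriminator fuses each of those images back in at the matching resolution, so its gradients reach every generator block directly.

```python
import torch
import torch.nn as nn

class MultiScaleGenerator(nn.Module):
    """Toy generator: every intermediate resolution emits its own RGB output."""
    def __init__(self, latent_dim=128, base_channels=64):
        super().__init__()
        self.base_channels = base_channels
        self.project = nn.Linear(latent_dim, base_channels * 4 * 4)
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.Upsample(scale_factor=2),
                nn.Conv2d(base_channels, base_channels, 3, padding=1),
                nn.LeakyReLU(0.2),
            )
            for _ in range(3)                               # 4x4 -> 8x8 -> 16x16 -> 32x32
        ])
        self.to_rgb = nn.ModuleList([nn.Conv2d(base_channels, 3, 1) for _ in range(4)])

    def forward(self, z):
        x = self.project(z).view(-1, self.base_channels, 4, 4)
        outputs = [self.to_rgb[0](x)]                       # 4x4 RGB output
        for block, to_rgb in zip(self.blocks, self.to_rgb[1:]):
            x = block(x)
            outputs.append(to_rgb(x))                       # RGB output at every scale
        return outputs                                      # lowest to highest resolution


class MultiScaleDiscriminator(nn.Module):
    """Toy discriminator: at each scale it fuses its downsampled features with
    the matching-resolution image, so gradients flow back at every scale."""
    def __init__(self, base_channels=64):
        super().__init__()
        self.from_rgb = nn.ModuleList([nn.Conv2d(3, base_channels, 1) for _ in range(4)])
        in_channels = [base_channels, base_channels * 2, base_channels * 2]
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(c_in, base_channels, 3, padding=1),
                nn.LeakyReLU(0.2),
                nn.AvgPool2d(2),
            )
            for c_in in in_channels
        ])
        self.head = nn.Linear(base_channels * 2 * 4 * 4, 1)

    def forward(self, images):
        imgs = list(reversed(images))                       # highest resolution first
        x = self.from_rgb[0](imgs[0])                       # features from the 32x32 image
        for i, block in enumerate(self.blocks):
            x = block(x)                                    # downsample to the next scale
            x = torch.cat([x, self.from_rgb[i + 1](imgs[i + 1])], dim=1)  # inject matching-scale image
        return self.head(x.flatten(1))                      # real/fake score
```

Calling `MultiScaleDiscriminator()(MultiScaleGenerator()(torch.randn(2, 128)))` produces one real/fake score per sample; during training, real images would be downsampled to the same set of resolutions and scored the same way.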
Academic service
Reviewer at ICCV/ECCV 2021, 2022, 2023, 2024
Reviewer at CVPR 2021, 2022, 2023, 2024
Reviewer at SIGGRAPH 2024
Reviewer at Pacific Graphics 2024
Reviewer at AAAI 2024