Spotlight Poster

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model

Yinghao Xu ⋅ Hao Tan ⋅ Fujun Luan ⋅ Sai Bi ⋅ Peng Wang ⋅ Jiahao Li ⋅ Zifan Shi ⋅ Kalyan Sunkavalli ⋅ Gordon Wetzstein ⋅ Zexiang Xu ⋅ Kai Zhang

2024 Spotlight Poster

Abstract

We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model incorporates a triplane NeRF representation and, functioning as a denoiser, can denoise noisy multi-view images via 3D NeRF reconstruction and rendering, achieving single-stage 3D generation in the 2D diffusion denoising process. We train DMV3D on large-scale multi-view image datasets of extremely diverse objects using only image reconstruction losses, without accessing 3D assets. We demonstrate state-of-the-art results for the single-image reconstruction problem where probabilistic modeling of unseen object parts is required for generating diverse reconstructions with sharp textures. We also show high-quality text-to-3D generation results outperforming previous 3D diffusion models. Our project website is at: https://dmv3d.github.io/.
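The idea of using a reconstruct-then-render model as the denoiser inside a multi-view diffusion loop can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the "scene" is a 1D vector, a "view" is a circular shift of that vector, the hypothetical `reconstruct` function averages the un-shifted views in place of the learned transformer that predicts a triplane NeRF, and `render` shifts the scene back in place of volume rendering. The sampler is a DDIM-style deterministic update with a linear noise schedule, also chosen here for simplicity rather than taken from the paper.

```python
import numpy as np

def render(scene, shifts):
    """Toy renderer (assumption): each view is the scene circularly shifted."""
    return np.stack([np.roll(scene, s) for s in shifts])

def reconstruct(views, shifts):
    """Toy stand-in for the learned reconstruction model (assumption):
    undo each view's shift and average into one shared scene."""
    return np.mean([np.roll(v, -s) for v, s in zip(views, shifts)], axis=0)

def denoise_x0(noisy_views, shifts):
    """Predict clean views by reconstructing a scene and re-rendering it."""
    scene = reconstruct(noisy_views, shifts)
    return render(scene, shifts), scene

def sample(shifts, dim=16, steps=50, seed=0):
    """DDIM-style deterministic sampling with an x0-predicting denoiser."""
    rng = np.random.default_rng(seed)
    # alpha_bar[0] = 1 means clean data; alpha_bar[steps] is almost pure noise.
    alpha_bar = np.linspace(1.0, 1e-3, steps + 1)
    x = rng.standard_normal((len(shifts), dim))  # start from noisy views
    scene = None
    for t in range(steps, 0, -1):
        x0, scene = denoise_x0(x, shifts)
        # Noise implied by the current sample and the predicted clean views.
        eps = (x - np.sqrt(alpha_bar[t]) * x0) / np.sqrt(1.0 - alpha_bar[t])
        # Move to the previous (less noisy) timestep.
        x = np.sqrt(alpha_bar[t - 1]) * x0 + np.sqrt(1.0 - alpha_bar[t - 1]) * eps
    return x, scene

if __name__ == "__main__":
    shifts = [0, 3, 7, 11]
    views, scene = sample(shifts)
    # The final views are exact renderings of the final scene.
    print(np.allclose(views, render(scene, shifts)))
```

Because every denoising step passes through a single shared scene, the final views are renderings of one underlying representation, which is the sense in which generation happens in a single stage inside the 2D denoising process. The averaging step here only enforces consistency across views; in the actual method that role is played by a trained network.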