Repository for 3D Diffusion Generation Papers

This repository digests research papers on 3D diffusion generation. The taxonomy and paper selection largely follow the survey paper (State of the Art on Diffusion Models for Visual Computing) and the repo: https://github.com/cwchenwang/awesome-3d-diffusion?tab=readme-ov-file

Survey Paper

  1. State of the Art on Diffusion Models for Visual Computing 🌟🌟🌟🌟🌟
    This survey provides an insightful introduction to diffusion models and their key applications, covering 2D, 3D, video, and 4D generation.

Direct 3D Generation via Diffusion Models

This series of papers focuses on modeling the distribution of 3D shapes, so that 3D content can be generated directly by a well-trained 3D diffusion model. Taxonomy: organized by the type of output representation.
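
As background for the representation-specific sections below, here is a minimal sketch of one denoising-diffusion training step on raw point clouds. The `eps_model` (a permutation-equivariant noise predictor, e.g. a PointNet-style network) and its signature are illustrative assumptions, not from any specific paper:

```python
import torch

def diffusion_train_step(eps_model, points, alphas_cumprod, optimizer):
    # points: (B, N, 3) batch of 3D point clouds.
    # eps_model(x_t, t) -> predicted noise; a hypothetical interface.
    B = points.shape[0]
    T = alphas_cumprod.shape[0]
    t = torch.randint(0, T, (B,), device=points.device)   # random diffusion timestep
    a = alphas_cumprod[t].view(B, 1, 1)                   # \bar{alpha}_t per sample
    noise = torch.randn_like(points)
    x_t = a.sqrt() * points + (1 - a).sqrt() * noise      # forward process q(x_t | x_0)
    loss = torch.nn.functional.mse_loss(eps_model(x_t, t), noise)  # epsilon-prediction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```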

Point cloud

Mesh

Optimization-based generation via Score Distillation Sampling

This series of papers leverages 2D diffusion models (pre-trained or trained from scratch) to generate high-quality and diverse 3D content. In this mode, the 3D content is not generated directly by a diffusion model; instead, the score information from a 2D diffusion model is used to optimize a 3D representation.
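
For reference, with a differentiable renderer $x = g(\theta)$, a frozen noise predictor $\hat{\epsilon}_\phi$, text prompt $y$, and timestep weighting $w(t)$, the SDS gradient from DreamFusion (item 1 below) is

$$\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta) = \mathbb{E}_{t,\epsilon}\left[ w(t)\left(\hat{\epsilon}_\phi(x_t;\, y, t) - \epsilon\right)\frac{\partial x}{\partial \theta} \right], \qquad x_t = \sqrt{\bar{\alpha}_t}\, x + \sqrt{1-\bar{\alpha}_t}\,\epsilon.$$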

  1. DreamFusion: Text-to-3D using 2D Diffusion 🌟🌟🌟🌟🌟 $\textit{ICLR 2023 Outstanding Paper Award}$
    This paper first proposes Score Distillation Sampling (SDS) for generating 3D content from a pre-trained 2D diffusion model. (NeRF)
  2. Magic3D: High-Resolution Text-to-3D Content Creation 🌟🌟🌟🌟 $\textit{CVPR 2023}$
    Building on SDS, it proposes a coarse-to-fine two-stage optimization to generate high-resolution 3D output efficiently. (NeRF & Mesh)
  3. Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation 🌟🌟🌟 $\textit{ICCV 2023}$
    They disentangle geometry and appearance modeling: geometry is optimized on a hybrid DMTet representation and appearance via a spatially varying BRDF, enabling physically based rendering. (Mesh)
  4. ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation 🌟🌟🌟🌟🌟 $\textit{NeurIPS 2023 Spotlight}$
    This paper proposes Variational Score Distillation (VSD) to enhance the quality and diversity of generated 3D content. (NeRF & Mesh)
  5. NFSD: Noise-Free Score Distillation
    This paper decomposes the SDS gradient and removes its residual noise component, improving distillation quality at standard guidance scales.
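
The sketch below shows one SDS update in PyTorch, implementing the gradient above via a surrogate loss. The interface `eps_model(x_t, t, text_emb)` for the frozen 2D noise predictor is an assumed placeholder, not any specific library's API:

```python
import torch

def sds_step(render, eps_model, text_emb, alphas_cumprod):
    # render: differentiable render of the 3D model, shape (B, C, H, W).
    # alphas_cumprod: cumulative product of the noise schedule, shape (T,).
    B = render.shape[0]
    T = alphas_cumprod.shape[0]
    t = torch.randint(0, T, (B,), device=render.device)    # random timestep
    a = alphas_cumprod[t].view(B, 1, 1, 1)
    noise = torch.randn_like(render)
    x_t = a.sqrt() * render + (1 - a).sqrt() * noise       # forward diffusion q(x_t | x_0)
    with torch.no_grad():                                  # no backprop through the U-Net
        eps_pred = eps_model(x_t, t, text_emb)
    w = 1.0 - a                                            # a common weighting choice w(t)
    grad = w * (eps_pred - noise)                          # SDS gradient direction
    # Surrogate loss whose gradient w.r.t. `render` is exactly `grad`
    loss = (grad.detach() * render).sum()
    loss.backward()                                        # populates grads of the 3D params
    return loss
```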

Multi-view Diffusion

These works train or fine-tune a diffusion model to generate multi-view images given a single image. Based on the model and output, they fall roughly into three categories. First, 2D diffusion models whose output is 3D-consistent color images. Second, 2D diffusion models whose output is color images plus geometry images (depth maps, normal maps, etc.). Third, methods that combine information from both 2D and 3D diffusion models.

Multi-view color images

  1. RealFusion: 360° Reconstruction of Any Object from a Single Image 🌟🌟🌟 $\textit{CVPR 2023}$
    A reconstruction loss and SDS are combined to reconstruct the object from the given image.
  2. Zero-1-to-3: Zero-shot One Image to 3D Object 🌟🌟🌟🌟🌟 $\textit{ICCV 2023}$
    Zero-shot transfer, single-image input, and 3D content generation (the meanings of "zero", "one", and "three"). It fine-tunes a pre-trained 2D diffusion model with relative camera-pose conditioning for novel-view synthesis (see the sampling sketch after this list).
  3. Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors 🌟🌟🌟🌟 $\textit{ICCV 2023}$
    The pipeline is similar to Magic3D's two-stage coarse-to-fine optimization. A single image is lifted to 3D using both 2D and 3D diffusion priors (the meanings of "one", "two", and "three").
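As referenced in the Zero-1-to-3 entry above, here is a deterministic DDIM sampling sketch for a pose-conditioned novel-view model. The interface `eps_model(x_t, t, cond_image_emb, rel_pose)` and all names are illustrative assumptions, not the paper's actual code:

```python
import torch

@torch.no_grad()
def novel_view(eps_model, cond_image_emb, rel_pose, alphas_cumprod, shape, steps=50):
    # rel_pose: relative camera offsets, e.g. (d_polar, d_azimuth, d_radius).
    device = alphas_cumprod.device
    T = alphas_cumprod.shape[0]
    ts = torch.linspace(T - 1, 0, steps).long().to(device)   # descending timesteps
    x = torch.randn(shape, device=device)                    # start from pure noise
    for i, t in enumerate(ts):
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[ts[i + 1]] if i + 1 < steps else torch.ones((), device=device)
        eps = eps_model(x, t.expand(shape[0]), cond_image_emb, rel_pose)
        x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()       # predicted clean image
        x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps   # deterministic DDIM step (eta = 0)
    return x
```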

Multi-view color + geometry images

  1. Wonder3D: Single Image to 3D using Cross-Domain Diffusion 🌟🌟🌟🌟🌟 $\textit{CVPR 2024}$
    Fine-tunes a pre-trained diffusion model to output cross-view-consistent color images and normal maps, then synthesizes a 3D model from the cross-domain images (via an SDF-based reconstruction, etc.).

3D-Aware Image Diffusion (2D + 3D diffusion)

  1. DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation 🌟🌟🌟🌟🌟 $\textit{ICLR 2024}$
    Uses 3D Gaussian splatting as the optimization target for score distillation and adds mesh extraction with UV-space texture refinement for efficient 3D content creation.

Generalizable Architecture