
HyperDreamer (SIGGRAPH Asia 2023)

Project page | Paper

Tong Wu, Zhibing Li, Shuai Yang, Pan Zhang, Xingang Pan, Jiaqi Wang, Dahua Lin, Ziwei Liu

Official implementation of HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image

Installation

Install Dependencies:

  • Install PyTorch >= 1.12. We have tested with torch 1.12.1+cu113, but other versions should also work.
# torch1.12.1+cu113
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
# install kaolin
pip install kaolin==0.14.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.12.1_cu113.html
  • Other dependencies:
pip install -r requirements.txt
pip install ./raymarching
pip install ./shencoder
pip install ./freqencoder
pip install ./gridencoder
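
After installing, you can run a quick smoke test. The extension module names below follow stable-dreamfusion's layout, which this repo builds on, so adjust them if they differ in your checkout:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import kaolin; print(kaolin.__version__)"
python -c "import raymarching, shencoder, freqencoder, gridencoder; print('CUDA extensions OK')"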

Download pretrained models

  • Zero123 for diffusion guidance
cd pretrained/zero123
wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
  • Omnidata for depth and normal prediction
mkdir pretrained/omnidata
cd pretrained/omnidata
gdown '1Jrh-bRnJEjyMCS7f-WsaFlccfPjJPPHI&confirm=t' # omnidata_dpt_depth_v2.ckpt
gdown '1wNxVO4vVbDEMEpnAi_jwQObf2MFodcBR&confirm=t' # omnidata_dpt_normal_v2.ckpt
  • 256-resolution tetrahedral grid for DMTet: download it and move it to tets/

  • SAM for segmentation

mkdir models
cd models
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
  • derender3d for de-rendering
mkdir models/co3d
wget -O models/co3d/checkpoint010.pth https://www.robots.ox.ac.uk/~vgg/research/derender3d/data/co3d.pth
  • PASD for super-resolution

    • Download SD1.5 models from huggingface and put them into PASD/checkpoints/stable-diffusion-v1-5
    • Download the PASD pre-trained models pasd and place the checkpoint-100000 directory inside PASD/runs/pasd/.
  • ControlNet (Normal2img) for editing

  1. Download control_v11p_sd15_normalbae.pth from the Hugging Face model page and put it under pretrained/controlnet/....

  2. Download the Stable Diffusion 1.5 checkpoint v1-5-pruned.ckpt and put it under pretrained/controlnet/....
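
Before moving on, you can sanity-check that every checkpoint landed where the training scripts expect it. This is only a convenience sketch: the two pretrained/controlnet/ paths are a guess at where the elided "..." destinations resolve, so adjust them to match your layout:

# Check that all downloaded checkpoints exist (paths mirror the steps above).
for f in \
  pretrained/zero123/zero123-xl.ckpt \
  pretrained/omnidata/omnidata_dpt_depth_v2.ckpt \
  pretrained/omnidata/omnidata_dpt_normal_v2.ckpt \
  models/sam_vit_h_4b8939.pth \
  models/co3d/checkpoint010.pth \
  pretrained/controlnet/control_v11p_sd15_normalbae.pth \
  pretrained/controlnet/v1-5-pruned.ckpt
do
  [ -f "$f" ] && echo "OK      $f" || echo "MISSING $f"
done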

Quickstart

Preprocess the input image to remove the background and obtain its depth, normal and caption.

python preprocess_image.py /path/to/image.png
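
If you have several inputs, you can preprocess a whole folder in one pass. This assumes the script writes its outputs (e.g. the *_rgba.png files used below) next to each source image:

# Preprocess every PNG in data/ in one go.
for img in data/*.png; do
  python preprocess_image.py "$img"
done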

We adopt a two-stage training pipeline. You can run it as follows:

image_path='data/strawberry_rgba.png'
nerf_workspace='exp/strawberry_s1'
dmtet_workspace='exp/strawberry_s2'

# Stage 1: NeRF
bash run_nerf.sh ${image_path} ${nerf_workspace}

# Stage 2: DMTet
bash run_dmtet.sh ${image_path} ${nerf_workspace} ${dmtet_workspace}
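
Putting it together, here is an end-to-end sketch for a fresh image. The workspace names are arbitrary, and the *_rgba.png name follows the preprocessing example above:

# Preprocess a raw image, then run both training stages.
name='strawberry'
python preprocess_image.py data/${name}.png
bash run_nerf.sh data/${name}_rgba.png exp/${name}_s1
bash run_dmtet.sh data/${name}_rgba.png exp/${name}_s1 exp/${name}_s2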

[Optional] We also support importing pre-defined material masks for the reference view. You can use Semantic-SAM or Materialistic to obtain more accurate masks; a sketch of how such a mask file might be built follows the command below.

bash run_dmtet.sh ${image_path} ${nerf_workspace} ${dmtet_workspace} --material_masks material_masks/xxx.npy
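
The snippet below merges per-region boolean masks (e.g. exported from Semantic-SAM) into a single integer label map. The (N, H, W) input layout and the H x W label-map output are assumptions rather than this repo's documented format, so check the mask loader before relying on it:

python - <<'EOF'
import numpy as np
# Hypothetical input: N boolean masks of shape (N, H, W), one per material region.
masks = np.load('sam_masks.npy')
label_map = np.zeros(masks.shape[1:], dtype=np.int32)  # 0 = unlabeled
for i, m in enumerate(masks, start=1):
    label_map[m] = i                                   # region i -> material id i
np.save('material_masks/strawberry.npy', label_map)
EOF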

To relight:

bash run_dmtet.sh ${image_path} ${nerf_workspace} ${dmtet_workspace} --test --relight_sg envmaps/lgtSGs_studio.npy
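
The lgtSGs_*.npy files encode the environment lighting as spherical Gaussians. To render the same model under every lighting condition shipped in envmaps/, you can loop over them (file names other than lgtSGs_studio.npy are assumptions; list the directory to see what is available):

# Relight under each spherical-Gaussian environment map in envmaps/.
for env in envmaps/lgtSGs_*.npy; do
  bash run_dmtet.sh ${image_path} ${nerf_workspace} ${dmtet_workspace} --test --relight_sg "$env"
done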

To run editing:

python editing/scripts/run_editing.py --config_path=editing/configs/sculpture.yaml

Gradio Demo (Editing)

python editing/app_edit.py

TODO

  • Release editing code.

Acknowledgement

This code is built on the open-source projects stable-dreamfusion, Zero123, derender3d, SAM and PASD.

Thanks to the maintainers of these projects for their contribution to the community!

Citation

If you find HyperDreamer helpful for your research, please cite:

@InProceedings{wu2023hyperdreamer,
  author = {Tong Wu and Zhibing Li and Shuai Yang and Pan Zhang and Xingang Pan and Jiaqi Wang and Dahua Lin and Ziwei Liu},
  title = {HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image},
  booktitle={ACM SIGGRAPH Asia 2023 Conference Proceedings},
  year={2023}
}