Shaocong Dong* · Lihe Ding* · Zhanpeng Huang · Zibin Wang · Tianfan Xue · Dan Xu
Paper | Project Page | Video
- 🔥 Interactive3D was accepted to CVPR 2024.
- release the refinement code based on threestudio.
- support Pixart-alpha as 2D guidance.
- we are developing a user-friendly interface and will release the stage-I implementation together with it.
conda create -n inter python=3.9
conda activate inter
# Newer pip versions, e.g. pip-23.x, can be much faster than old versions, e.g. pip-20.x.
# For instance, it caches the wheels of git packages to avoid unnecessarily rebuilding them later.
python3 -m pip install --upgrade pip
- Install PyTorch. We have tested CUDA 11.7 with torch 2.0.1; other versions should also work.
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
- Install dependencies:
pip install -r requirements.txt
# install terminfo
sudo apt-get install libncurses5-dev libncursesw5-dev
- [Optional] Install gsgen (see the gsgen repository for more information):
git clone https://github.com/gsgen3d/gsgen.git
cd gsgen/gs
./build.sh
Note: if you run into installation issues, refer to threestudio.
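After installation it can help to confirm which package versions actually ended up in the environment. The following is a minimal sketch, not part of the repository: `check_versions` and `TESTED` are hypothetical names, and the pinned versions are simply the ones from the `pip install` command above. A mismatch is not necessarily a problem, since other versions are reported to work as well.

```python
from importlib import metadata

# Versions from the pip command above; treated here as "tested", not "required".
TESTED = {"torch": "2.0.1", "torchvision": "0.15.2", "torchaudio": "2.0.2"}


def check_versions(tested=TESTED, lookup=metadata.version):
    """Return {package: (installed_or_None, tested)} for every mismatch."""
    mismatches = {}
    for name, wanted in tested.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu117" when comparing.
        base = installed.split("+")[0] if installed else None
        if base != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


if __name__ == "__main__":
    for name, (installed, wanted) in check_versions().items():
        print(f"{name}: installed={installed}, tested={wanted}")
```

An empty result means the installed packages match the tested combination.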
- Start from an existing Gaussian splatting result, or generate one with gsgen:
# the checkpoint will be saved to the gsgen folder
cd ../../gsgen
python main.py --config-name=base prompt.prompt="<prompt>"
- Convert the Gaussian splatting result to NeRF:
# feel free to adjust the hyperparameters in the config
cd ..
python launch.py --config configs/fit_gs.yaml --train --gpu 0 system.prompt_processor.prompt="your prompt" trainer.max_steps=1800
- [Optional] refine the geometry:
# text-to-image guidance sometimes changes the content, so we now use image-to-image guidance
# feel free to adjust parameters such as the number of training steps
python launch.py --config configs/geo_refine.yaml --train --gpu 0 system.prompt_processor.prompt="your prompt" resume=path_to_your/ckpts/last.ckpt trainer.max_steps=20000 system.init_type='gsgen' system.only_super=True
# after geometry refinement, run the following command to obtain a standard NeRF representation for the later steps
python launch.py --config configs/post_geo_refine.yaml --train --gpu 1 system.prompt_processor.prompt="your prompt" trainer.max_steps=3000
- Refine the region of interest:
Note: if you are using tmux, export the terminfo path first.
export TERMINFO=path_to/terminfo
# 1. save the occ grid for region-of-interest selection; it will be written to debug_data
python launch.py --config configs/interested_refine.yaml --train --gpu 0 system.prompt_processor.prompt="your prompt" resume=path_to_your/ckpts/last.ckpt trainer.max_steps=20000 system.init_type='threestudio' system.only_super=True system.renderer.save_occ=True
# 2. use the provided utility to select regions of interest on your local machine, then upload the selected index npy file to debug_data (a more advanced version including SAM will be released together with the interface)
python utils/region_select_tool.py
# 3. for now we provide a simple command controller to adjust the camera; it will be upgraded in the final interface
python launch.py --config configs/interested_refine.yaml --train --gpu 0 system.prompt_processor.prompt="your prompt" resume=path_to_your/ckpts/last.ckpt trainer.max_steps=20000 system.init_type='threestudio' system.only_super=True
# optional: you can also try Pixart-alpha as guidance:
python launch.py --config configs/interested_refine_pixart.yaml --train --gpu 0 system.prompt_processor.prompt="your prompt" resume=path_to_your/ckpts/last.ckpt trainer.max_steps=10000 system.init_type='threestudio' system.only_super=True
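The stages above all share the same `launch.py` invocation pattern and several of them resume from a `ckpts/last.ckpt` file. The sketch below shows one way to script that, under stated assumptions: `latest_checkpoint` and `launch_command` are hypothetical helpers that are not part of the repository, and the `outputs` directory name is assumed, so pass whatever directory your runs actually write to.

```python
import shlex
from pathlib import Path


def latest_checkpoint(root):
    """Return the most recently modified ckpts/last.ckpt below root, or None."""
    root = Path(root)
    if not root.is_dir():
        return None
    candidates = [p for p in root.rglob("last.ckpt") if p.parent.name == "ckpts"]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def launch_command(config, prompt, resume=None, max_steps=None, gpu=0, overrides=None):
    """Build the argument list for one launch.py stage as shown above."""
    cmd = ["python", "launch.py", "--config", config, "--train", "--gpu", str(gpu),
           f"system.prompt_processor.prompt={prompt}"]
    if resume is not None:
        cmd.append(f"resume={resume}")
    if max_steps is not None:
        cmd.append(f"trainer.max_steps={max_steps}")
    # Extra dotted overrides, e.g. {"system.init_type": "threestudio"}.
    cmd += [f"{key}={value}" for key, value in (overrides or {}).items()]
    return cmd


if __name__ == "__main__":
    ckpt = latest_checkpoint("outputs")  # assumed output directory
    cmd = launch_command(
        "configs/interested_refine.yaml", "your prompt",
        resume=ckpt, max_steps=20000,
        overrides={"system.init_type": "threestudio", "system.only_super": True},
    )
    # Inspect the command first; run it with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list keeps a prompt with spaces as a single argument, so no extra shell quoting is needed when it is passed to `subprocess.run`.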
We will add more detailed illustrations soon.
Interactive3D builds on many amazing research works; thanks a lot to all the authors for sharing! We also thank Yiyuan for the valuable discussions and help refining the paper.
If the paper and the code are helpful for your research, please kindly cite:
@article{dong2024interactive3d,
title={Interactive3D: Create What You Want by Interactive 3D Generation},
author={Dong, Shaocong and Ding, Lihe and Huang, Zhanpeng and Wang, Zibin and Xue, Tianfan and Xu, Dan},
journal={arXiv preprint arXiv:2404.16510},
year={2024}
}