
E-D3DGS: Embedding-Based Deformable 3D Gaussian Splatting (ECCV 2024)


Jeongmin Bae1*, Seoha Kim1*, Youngsik Yun1,
Hahyun Lee2 , Gun Bang2, Youngjung Uh1†

1Yonsei University   2Electronics and Telecommunications Research Institute (ETRI)
* Equal Contribution   † Corresponding Author


Official repository for "Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting".
Our approach employs per-Gaussian latent embeddings to predict the deformation of each Gaussian, achieving a clearer representation of dynamic motion.


Environment Setup

Please follow the environment setup of 3DGS to install the required packages.

git clone https://github.com/JeongminB/E-D3DGS.git
cd E-D3DGS
git submodule update --init --recursive

conda create -n ed3dgs python=3.7 
conda activate ed3dgs

# If the submodules fail to download, refer to the 3DGS repository
pip install -r requirements.txt
pip install -e submodules/diff-gaussian-rasterization/
pip install -e submodules/simple-knn/ 

We use PyTorch 1.13.1 with CUDA 11.6 (pytorch=1.13.1+cu116) in our environment.
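
For reference, one way to install a matching build is from the official PyTorch CUDA 11.6 wheel index (the torchvision pin below is our assumption of the compatible companion release):

# Install PyTorch 1.13.1 built against CUDA 11.6
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 \
    --extra-index-url https://download.pytorch.org/whl/cu116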

Data Preparation

Downloading Datasets:
Please download the datasets from their official websites: HyperNeRF, Neural 3D Video, and Technicolor.

  • Please remove 'cam13.mp4' and its corresponding pose from the coffee_martini scene in the Neural 3D Video dataset (a sketch of this cleanup follows this list).
  • We split the entire flame_salmon_1_split scene into four 300-frame scenes.
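
A minimal sketch of the coffee_martini cleanup (hypothetical; it assumes the scene's poses_bounds.npy rows follow the cam00, cam01, ... ordering, so cam13 corresponds to row index 13):

# Remove cam13 and its pose row from coffee_martini (paths are placeholders)
rm <dataset>/coffee_martini/cam13.mp4
python - <<'EOF'
import numpy as np
p = np.load('<dataset>/coffee_martini/poses_bounds.npy')
np.save('<dataset>/coffee_martini/poses_bounds.npy', np.delete(p, 13, axis=0))
EOF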

Extracting point clouds from COLMAP:

# setup COLMAP 
bash script/colmap_setup.sh
conda activate colmapenv 

# automatically extract the frames and reorganize them
python script/pre_n3v.py --videopath <dataset>/<scene>
python script/pre_technicolor.py --videopath <dataset>/<scene>
python script/pre_hypernerf.py --videopath <dataset>/<scene>

# downsample dense point clouds
python script/downsample_point.py \
<dataset>/<scene>/colmap/dense/workspace/fused.ply <dataset>/<scene>/points3D_downsample.ply
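
For example, for a single Neural 3D Video scene (hypothetical paths matching the layout below; the downsampling step assumes COLMAP's dense reconstruction has already produced fused.ply):

# Extract frames for one N3V scene, then downsample its dense point cloud
python script/pre_n3v.py --videopath data/n3v/cook_spinach
python script/downsample_point.py \
    data/n3v/cook_spinach/colmap/dense/workspace/fused.ply \
    data/n3v/cook_spinach/points3D_downsample.ply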

After running COLMAP, the Neural 3D Video and Technicolor datasets are organized as follows:

├── data
│   ├── n3v
│   │   ├── cook_spinach
│   │   │   ├── colmap
│   │   │   ├── images
│   │   │   │   ├── cam01
│   │   │   │   │   ├── 0000.png
│   │   │   │   │   ├── 0001.png
│   │   │   │   │   ├── ...
│   │   │   │   ├── cam02
│   │   │   │   │   ├── 0000.png
│   │   │   │   │   ├── 0001.png
│   │   │   │   │   ├── ...
│   │   ├── cut_roasted_beef
│   │   ├── ...

Training

If you want to train with 2x downsampled images, add -r 2 to the command line.

# Train
python train.py -s $GT_PATH/$SCENE --configs arguments/$DATASET/$CONFIG.py --model_path $OUTPUT_PATH --expname $DATASET/$SCENE
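
For example, with the layout above (the config name arguments/n3v/cook_spinach.py and the output path are hypothetical; substitute the config your checkout provides):

# Hypothetical example: train cook_spinach on 2x downsampled images
python train.py -s data/n3v/cook_spinach --configs arguments/n3v/cook_spinach.py \
    --model_path output/n3v/cook_spinach --expname n3v/cook_spinach -r 2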

Rendering

The current code does not yet support video rendering for visualization (e.g., spiral path rendering).

# Render test view only
python render.py --model_path $OUTPUT_PATH --configs arguments/$DATASET/$CONFIG.py --skip_train --skip_video

# Render train view only
python render.py --model_path $OUTPUT_PATH --configs arguments/$DATASET/$CONFIG.py --skip_test --skip_video
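
Continuing the hypothetical example from the Training section:

# Render only the test views of the cook_spinach model trained above
python render.py --model_path output/n3v/cook_spinach \
    --configs arguments/n3v/cook_spinach.py --skip_train --skip_video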

Evaluation

Note: In our paper, we calculate FPS by measuring rendering time only, excluding operations such as image saving.

# Evaluate
python metrics.py --model_path $SAVE_PATH/$DATASET/$CONFIG

Note

  • We provide scripts that collectively perform training, rendering, and evaluation; see train_<dataset_name>.sh (a sketch follows this list).
  • You will need to configure the dataset path according to your system.
  • In the config file, make sure that the total_num_frames and maxtime are equal to the total number of training frames.
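
As a rough sketch, such a script chains the three commands from the sections above (variable values are placeholders; the actual train_<dataset_name>.sh in the repository may differ):

# Sketch of a train/render/evaluate pipeline for one scene (placeholder values)
GT_PATH=data/n3v; DATASET=n3v; SCENE=cook_spinach; CONFIG=cook_spinach
OUTPUT_PATH=output/$DATASET/$SCENE

python train.py -s $GT_PATH/$SCENE --configs arguments/$DATASET/$CONFIG.py \
    --model_path $OUTPUT_PATH --expname $DATASET/$SCENE
python render.py --model_path $OUTPUT_PATH --configs arguments/$DATASET/$CONFIG.py \
    --skip_train --skip_video
python metrics.py --model_path $OUTPUT_PATH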

Acknowledgements

This code is based on 3DGS, 4DGaussians and STG. In particular, we used 4DGaussians as a starting point for our study. We would like to thank the authors of these papers for their hard work. 😊

BibTeX

@inproceedings{bae2024ed3dgs,
    title     = {Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting},
    author    = {Bae, Jeongmin and Kim, Seoha and Yun, Youngsik and Lee, Hahyun and Bang, Gun and Uh, Youngjung},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year      = {2024}
}