
CIPS-3D++

Paper

3D-aware GAN inversion
FlipInversion_profile.mp4

Preparing envs

bash exp/tests/setup_env_debug.sh

# Install pytorch3d
git clone https://github.com/facebookresearch/pytorch3d
cd pytorch3d
pip install -e .
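
An optional sanity check to confirm the environment is usable (plain PyTorch / PyTorch3D imports, nothing repo-specific):

# Optional sanity check: verify that PyTorch and PyTorch3D import and that CUDA is visible.
import torch
import pytorch3d

print(torch.__version__, torch.cuda.is_available())
print(pytorch3d.__version__)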

Quick start

Pretrained models

https://github.com/PeterouZh/CIPS-3Dplusplus/releases/download/v1.0.0/pretrained.zip
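
A minimal sketch for downloading and unpacking the pretrained weights with the Python standard library; the extraction target (the current directory) is an assumption, so move the files to wherever the test scripts expect them:

# Download and unpack the pretrained weights (destination directory is an assumption).
import pathlib
import urllib.request
import zipfile

url = "https://github.com/PeterouZh/CIPS-3Dplusplus/releases/download/v1.0.0/pretrained.zip"
dest = pathlib.Path("pretrained.zip")

urllib.request.urlretrieve(url, dest)   # fetch the release asset
with zipfile.ZipFile(dest) as zf:
    zf.extractall(".")                  # unpack into the current directory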

Faces

  • Sampling multi-view images
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_ffhq_v10;\
  Testing_train_cips3d_ffhq_v10().test__sample_multi_view_web(debug=False)" \
  --tl_opts port 8501
Results

  • Inversion
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_ffhq_v10;\
  Testing_train_cips3d_ffhq_v10().test__flip_inversion_web(debug=False)" \
  --tl_opts port 8501
Results

  • Rendering multi-view images. Note that rendering multi-view images requires the inversion step above to have been completed first. Once inversion is done, run the command below:
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_ffhq_v10;\
  Testing_train_cips3d_ffhq_v10().test__render_multi_view_web(debug=False)" \
  --tl_opts port 8501
Results

  • Stylization
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_ffhq_v10;\
  Testing_train_cips3d_ffhq_v10().test__interpolate_decoder_web(debug=False)" \
  --tl_opts port 8501
Results

  • Style mixing
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_ffhq_v10;\
  Testing_train_cips3d_ffhq_v10().test__style_mixing_web(debug=False)" \
  --tl_opts port 8501
Results

  • Rendering time
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_ffhq_v10;\
  Testing_train_cips3d_ffhq_v10().test__rendering_time(debug=False)"

Cars

  • Sampling multi-view images
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_compcars_v10;\
  Testing_train_cips3d_compcars_v10().test__sample_multi_view_web(debug=False)" \
  --tl_opts port 8501
Results

  • Inversion
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_compcars_v10;\
  Testing_train_cips3d_compcars_v10().test__flip_inversion_web(debug=False)" \
  --tl_opts port 8501

Please note that car inversion is only a preliminary experiment and its results are not stable. The main difficulty lies in accurately estimating the camera pose of the car. We currently mitigate this by setting an appropriate initial camera pose for both the original image and the flipped image (see `azim_init`).
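
A minimal sketch of this idea, assuming a simple azimuth/elevation camera parameterization; the names and values below are illustrative, not the repo's exact API:

import torch

# Illustrative initial camera poses for flip inversion (assumed parameterization and values).
azim_init = 0.3   # initial azimuth (radians) guessed for the original car image
elev_init = 0.0   # initial elevation (radians)

pose_orig = torch.tensor([ azim_init, elev_init], requires_grad=True)
pose_flip = torch.tensor([-azim_init, elev_init], requires_grad=True)  # mirrored azimuth for the flipped image

# Both poses would then be refined jointly with the latent code during inversion, e.g.
# optimizer = torch.optim.Adam([latent, pose_orig, pose_flip], lr=1e-2)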

Results

  • Rendering multi-view images. Note that rendering multi-view images requires the inversion step above to have been completed first. Once inversion is done, run the command below:
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=.:exp
python -c "from exp.tests.test_cips3dpp import Testing_train_cips3d_compcars_v10;\
    Testing_train_cips3d_compcars_v10().test__render_multi_view_web(debug=False)" \
    --tl_opts port 8501
Results

Training & Evaluation

tree datasets/ffhq/images1024x1024
datasets/ffhq/images1024x1024
├── 00000
│   ├── 00000.png
│   ├── 00001.png
│   └── xxx
└── 01000
    ├── 01000.png
    ├── 01001.png
    └── xxx
  • Preparing lmdb dataset (a sketch of reading the resulting lmdb follows this list)
bash exp/cips3d/bash/preparing_dataset/prepare_data_ffhq_r1024.sh
  • Training
bash exp/cips3d/bash/train_cips3d_ffhq_v10/train_r1024_r64_ks1.sh
  • Evaluation
bash exp/cips3d/bash/train_cips3d_ffhq_v10/eval_fid_r1024.sh
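
The prepare script packs the images into an lmdb database. Below is a minimal sketch of reading one image back, assuming a StyleGAN2-style key layout ("<resolution>-<index>" mapping to encoded image bytes); the output path datasets/ffhq/lmdb1024 and the key format are assumptions, so check the prepare script for the actual values:

# Read one sample back from the prepared lmdb (path and key format are assumptions).
import io

import lmdb
from PIL import Image

env = lmdb.open("datasets/ffhq/lmdb1024", readonly=True, lock=False)
with env.begin(write=False) as txn:
    key = f"{1024}-{0:05d}".encode("utf-8")   # assumed key layout: "<resolution>-<index>"
    img = Image.open(io.BytesIO(txn.get(key)))
    print(img.size)  # expected: (1024, 1024)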

Finetuning on Disney Faces

tree datasets/Disney_cartoon/cartoon
datasets/Disney_cartoon/cartoon/
├── Cartoons_00002_01.jpg
├── Cartoons_00003_01.jpg
├── xxx
  • Preparing lmdb dataset
rm -rf datasets/Disney_cartoon/cartoon/.ipynb_checkpoints/
pushd datasets/Disney_cartoon/cartoon/
mkdir images
mv *jpg images/
popd
bash exp/cips3d/bash/preparing_dataset/prepare_data_disney_r1024.sh
  • Finetuning
bash exp/cips3d/bash/train_cips3d_ffhq_v10/finetune_r1024_r64_ks1_disney.sh

Explanation for the name of CIPS-3D

The generator consists of a NeRF and a 2D MLP, and every pixel is synthesized individually, conditioned only on the style vector; hence the name conditionally-independent pixel synthesis (CIPS). We explain the concept of CIPS from a probabilistic perspective. Let `a` and `b` be two pixels of an image `x` that share the same style vector `w` but are synthesized individually. The corresponding directed acyclic graph (DAG) has only the edges `w → a` and `w → b`, so the joint distribution factorizes as

p(a, b, w) = p(w) p(a | w) p(b | w).

According to the conditional probability formula, we have

p(a, b | w) = p(a, b, w) / p(w).

Combining the above formulas, we get

p(a, b | w) = p(a | w) p(b | w).

Hence, `a` and `b` are conditionally independent given `w`.
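
As a toy illustration (not the repo's actual generator), the sketch below synthesizes each pixel with a shared MLP from its own coordinate and the style vector `w` only, which is exactly the conditional-independence structure described above:

import torch
import torch.nn as nn

# Toy conditionally-independent pixel synthesizer (illustrative only).
class ToyCIPS(nn.Module):
    def __init__(self, style_dim=64, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 + style_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # RGB of a single pixel
        )

    def forward(self, coords, w):
        # coords: (N, 2) pixel coordinates; w: (style_dim,) shared style vector.
        w_rep = w.unsqueeze(0).expand(coords.shape[0], -1)
        # Each output row depends only on its own coordinate and w, not on other pixels.
        return self.net(torch.cat([coords, w_rep], dim=-1))

g = ToyCIPS()
w = torch.randn(64)
pixels_ab = g(torch.tensor([[0.1, 0.2], [0.7, 0.9]]), w)  # pixels a and b, synthesized individually
print(pixels_ab.shape)  # torch.Size([2, 3])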