Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis

Project page | Paper

"Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis"
Jeong-gi Kwak, Yuanming Li, Dongsik Yoon, Donghyeon Kim, David Han, Hanseok Ko
ECCV 2022

This repository contains the official PyTorch implementation of SURF-GAN.

SURF-GAN

SURF-GAN, which is a NeRF-based 3D-aware GAN, can discover disentangled semantic attributes in an unsupervised manner.

(Trained on 64x64 CelebA and rendered at 256x256)

Get started

Clone the repo.

git clone https://github.com/jgkwak95/SURF-GAN.git
cd SURF-GAN

Create a virtual environment.

conda create -n surfgan python=3.7.1
conda activate surfgan
conda install -c pytorch-lts pytorch torchvision 
pip install --no-cache-dir -r requirements.txt

Train SURF-GAN

First, open curriculum.py and specify the dataset and training options.

# CelebA
python train_surf.py --output_dir {your-exp-name} \
    --curriculum CelebA_single

Pretrained model

A pretrained model will be uploaded.

Semantic attribute discovery

Traverse each dimension to inspect the discovered semantics.

python discover_semantics.py --experiment {your-exp-name} \
    --image_size 256 \
    --ray_step_multiplier 2 \
    --num_id 9 \
    --traverse_range 3.0 \
    --intermediate_points 9 \
    --curriculum CelebA_single
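
As a rough illustration of what --traverse_range and --intermediate_points control, the sketch below assumes the traversal samples evenly spaced coefficients between -traverse_range and +traverse_range. The function name traversal_values is hypothetical and is not taken from discover_semantics.py; the actual script may sample differently.

```python
from typing import List


def traversal_values(traverse_range: float, intermediate_points: int) -> List[float]:
    """Evenly spaced coefficients in [-traverse_range, +traverse_range].

    Assumption: this mirrors how a semantic dimension is swept; it is an
    illustrative helper, not code from the SURF-GAN repository.
    """
    if intermediate_points < 2:
        # A single point can only sit at the center of the range.
        return [0.0]
    step = 2.0 * traverse_range / (intermediate_points - 1)
    return [-traverse_range + i * step for i in range(intermediate_points)]


if __name__ == "__main__":
    # Same values as in the command above: range 3.0, 9 points.
    print(traversal_values(3.0, 9))
```

Under this assumption, the command above would render each identity at nine coefficient values, with the unmodified sample in the middle.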

By default, the latest checkpoint (generator.pth) is traversed. To use a specific checkpoint, add the following option to the command line, for example:

--specific_ckpt 140000_64_generator.pth
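
The checkpoint rule above can be sketched as follows. This is a minimal illustration that assumes checkpoints are stored directly inside the experiment directory; resolve_checkpoint is a hypothetical helper and not a function from this repository.

```python
import os
from typing import Optional


def resolve_checkpoint(experiment_dir: str, specific_ckpt: Optional[str] = None) -> str:
    """Return the generator checkpoint path to load.

    Assumption: checkpoints sit directly inside the experiment directory.
    Falls back to the latest file, generator.pth, when no specific
    checkpoint name is given.
    """
    filename = specific_ckpt if specific_ckpt else "generator.pth"
    return os.path.join(experiment_dir, filename)


if __name__ == "__main__":
    print(resolve_checkpoint("my-exp"))
    print(resolve_checkpoint("my-exp", "140000_64_generator.pth"))
```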

Render video

Moving camera

Set the mode: yaw, pitch, fov, etc. You can also define your own trajectory.

python render_video.py --experiment {your-exp-name} \
    --image_size 128 \
    --ray_step_multiplier 2 \
    --num_frames 100 \
    --curriculum CelebA_single \
    --mode yaw
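
For readers who want to define their own trajectory, the sketch below shows one way per-frame camera angles could be generated for the yaw, pitch, and circle modes. It is an assumption-laden illustration: the function camera_trajectory, the max_angle parameter, and the exact angle formulas are hypothetical and may differ from what render_video.py does.

```python
import math
from typing import List, Tuple


def camera_trajectory(mode: str, num_frames: int, max_angle: float = 0.3) -> List[Tuple[float, float]]:
    """Return one (yaw, pitch) offset in radians per frame.

    Assumption: offsets are relative to a frontal view. "yaw" and "pitch"
    sweep linearly from -max_angle to +max_angle; "circle" moves the camera
    around a circle of radius max_angle.
    """
    trajectory = []
    for frame in range(num_frames):
        # Normalized time in [0, 1]; guard against a single-frame video.
        t = frame / max(num_frames - 1, 1)
        if mode == "yaw":
            yaw, pitch = max_angle * (2.0 * t - 1.0), 0.0
        elif mode == "pitch":
            yaw, pitch = 0.0, max_angle * (2.0 * t - 1.0)
        elif mode == "circle":
            yaw = max_angle * math.cos(2.0 * math.pi * t)
            pitch = max_angle * math.sin(2.0 * math.pi * t)
        else:
            raise ValueError("unsupported mode: " + mode)
        trajectory.append((yaw, pitch))
    return trajectory


if __name__ == "__main__":
    for yaw, pitch in camera_trajectory("yaw", 5):
        print(round(yaw, 3), round(pitch, 3))
```

A custom trajectory would then amount to adding another branch that maps the normalized time t to a (yaw, pitch) pair.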

Moving camera with a specific semantic

Choose the attribute you want to control, L_i D_j, and pass its indices with --L and --D.

python render_video_semantic.py --experiment {your-exp-name} \
    --image_size 128 \
    --ray_step_multiplier 2 \
    --num_frames 100 \
    --traverse_range 3.0 \
    --intermediate_points \
    --curriculum CelebA_single \
    --mode circle \
    --L 2 \
    --D 4

3D-Controllable StyleGAN

Injecting the 3D prior of SURF-GAN into StyleGAN.

Video

+ Style

It is also compatible with numerous StyleGAN-based techniques, e.g., Toonifying.

Limitation

3D-controllable StyleGAN is not based on a 3D representation such as a mesh or NeRF, so generated videos exhibit the “texture sticking” problem pointed out in StyleGAN3, especially in hair and beards. This is one of the most noticeable artifacts in GAN-generated videos. We expect it to be mitigated with StyleGAN3.

Citation

@article{kwak2022injecting,
  author    = {Kwak, Jeong-gi and Li, Yuanming and Yoon, Dongsik and Kim, Donghyeon and Han, David and Ko, Hanseok},
  title     = {Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis},
  journal   = {arXiv},
  year      = {2022},
}

Acknowledgments

SURF-GAN is built upon the pi-GAN implementation and inspired by EigenGAN.
We used the pSp encoder (e4e also works) and StyleGAN2-pytorch to build 3D-controllable StyleGAN.
