This is a PyTorch implementation of the following paper:
GAN-Control: Explicitly Controllable GANs, ICCV 2021, [paper] [project page].
Alon Shoshan, Nadav Bhonker, Igor Kviatkovsky and Gerard Medioni.
Abstract:
We present a framework for training GANs with explicit control over generated facial images.
We are able to control the generated image by setting exact attributes such as age, pose, expression, etc.
Most approaches for manipulating GAN-generated images achieve partial control by leveraging the latent space disentanglement properties, obtained implicitly after standard GAN training.
Such methods are able to change the relative intensity of certain attributes, but not explicitly set their values.
Recently proposed methods, designed for explicit control over human faces, harness morphable 3D face models (3DMM) to allow fine-grained control capabilities in GANs.
Unlike these methods, our control is not constrained to 3DMM parameters and is extendable beyond the domain of human faces.
Using contrastive learning, we obtain GANs with an explicitly disentangled latent space.
This disentanglement is utilized to train control-encoders mapping human-interpretable inputs to suitable latent vectors, thus allowing explicit control.
In the domain of human faces we demonstrate control over identity, age, pose, expression, hair color and illumination.
We also demonstrate control capabilities of our framework in the domains of painted portraits and dog image generation.
We demonstrate that our approach achieves state-of-the-art performance both qualitatively and quantitatively.
Explicitly controlling face attributes such as illumination, pose, expression, hair color and age:
Explicitly controlling painting attributes such as pose, expression and age:
Changing the artistic style of paintings while maintaining all other attributes:
Explicitly controlling the pose of generated images of dogs:
Download the trained GAN and save it in resources/gan_models.
Examples of how to explicitly and implicitly control the GAN's generation can be found in notebooks/gan_control_inference_example.ipynb.
Examples include:
- Explicitly controlling pose.
- Explicitly controlling age.
- Explicitly controlling hair color.
- Explicitly controlling illumination.
- Explicitly controlling expression.
- Accessing and implicitly modifying the GAN's latent space.
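The last item relies on the latent space being explicitly disentangled, so an attribute can be changed by editing only the part of the latent vector tied to it. The following is a minimal sketch of that idea in plain Python. The `LAYOUT` table, the vector length and the `swap_attribute` helper are hypothetical and are not part of this repository's API; the notebook shows the actual calls.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: attribute name -> (start, end) slice of the latent vector.
# The real sub-vector sizes and order come from the model config, not from this sketch.
LAYOUT: Dict[str, Tuple[int, int]] = {
    "id": (0, 4),
    "pose": (4, 6),
    "age": (6, 8),
}


def swap_attribute(
    target: List[float],
    source: List[float],
    attribute: str,
    layout: Dict[str, Tuple[int, int]] = LAYOUT,
) -> List[float]:
    """Return a copy of `target` whose `attribute` sub-vector is taken from `source`."""
    if len(target) != len(source):
        raise ValueError("latent vectors must have the same length")
    start, end = layout[attribute]
    edited = list(target)  # copy, so the input latent is left untouched
    edited[start:end] = source[start:end]
    return edited


latent_a = [0.1] * 8
latent_b = [0.9] * 8
edited = swap_attribute(latent_a, latent_b, "pose")
print(edited)  # only the two "pose" entries now come from latent_b
```

With a disentangled generator, feeding `edited` instead of `latent_a` would be expected to change the pose while keeping the other attributes.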
The training process consists of two phases:
- Training a disentangled GAN.
- Training control/attribute encoders:
  - Constructing a {control/attribute : w latent} dataset.
  - Training control encoders.
- Use one of the configs in src/gan_control/configs: ffhq.json for faces, metfaces.json for paintings and afhq.json for dogs.
- In the config, edit data_config.path to point to your dataset directory.
- Prepare the pretrained predictors (see: Prepare pretrained predictors) and save them in src/gan_control/pretrained_models.
- Download the inception statistics (for FID calculations) and save them in src/gan_control/inception_stat.
- Run: python src/gan_control/train_generator.py --config_path <your config>
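The config edit and the training command from the steps above can be scripted. This is a minimal sketch that assumes data_config.path denotes a "path" key nested under a "data_config" object in the JSON config; the helper names `point_config_at_dataset` and `training_command` are made up for illustration and do not exist in the repository.

```python
import json
import shlex
from pathlib import Path


def point_config_at_dataset(config_path, dataset_dir, out_path):
    """Copy a training config and set data_config.path to `dataset_dir`."""
    config = json.loads(Path(config_path).read_text())
    # Assumption: "data_config.path" is the key "path" inside the "data_config" object.
    config.setdefault("data_config", {})["path"] = str(dataset_dir)
    Path(out_path).write_text(json.dumps(config, indent=4))
    return config


def training_command(config_path):
    """Build the phase 1 command as an argument list (suitable for subprocess.run)."""
    return ["python", "src/gan_control/train_generator.py",
            "--config_path", str(config_path)]


# Example usage (all paths are placeholders):
# point_config_at_dataset("src/gan_control/configs/ffhq.json", "/data/ffhq", "my_ffhq.json")
# print(shlex.join(training_command("my_ffhq.json")))
```

Writing the edited config to a new file keeps the shipped configs unchanged, which makes it easier to compare runs.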
Training results will be saved in <results_dir (given in the config file)>/<save_name (given in the config file)>_<some hyperparameters>_<time>.
This phase was trained on 4 Nvidia V100 GPUs with a batch size of 16 (batch of 4 per GPU).
Run: python src/gan_control/make_attributes_df.py --model_dir <path to the GAN directory from phase 1> --save_path <dir where the dataset will be saved>/<dataset name>.pkl
The dataset will be saved as a DataFrame at save_path.
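As a small convenience, the command above can be assembled from its three inputs. This is only a sketch: `attributes_df_command` is a hypothetical helper, and the directory and dataset names in the example are placeholders.

```python
import shlex
from pathlib import PurePosixPath


def attributes_df_command(model_dir, save_dir, dataset_name):
    """Build the make_attributes_df.py command as an argument list."""
    # save_path is <dir where the dataset will be saved>/<dataset name>.pkl
    save_path = PurePosixPath(save_dir) / f"{dataset_name}.pkl"
    return [
        "python", "src/gan_control/make_attributes_df.py",
        "--model_dir", str(model_dir),
        "--save_path", str(save_path),
    ]


cmd = attributes_df_command("results/my_gan", "datasets", "ffhq_attributes")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Assuming the output is a pickled pandas DataFrame, `pandas.read_pickle(save_path)` should load it for inspection.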
For each attribute you want to control:
- Edit the corresponding config from src/gan_control/configs/controller_configs:
  - In generator_dir, write the path to your GAN directory from phase 1.
  - In sampled_df_path, write the path to the {control/attribute : w latent} dataset (path to the DataFrame).
- Run: python src/gan_control/train_controller.py --config_path <your edited config>
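Since the same two fields have to be filled in for every attribute, the edit can be automated. The sketch below assumes the controller configs are JSON files; because the nesting of generator_dir and sampled_df_path inside the file is not stated here, it searches for the keys wherever they occur. Both helper names are hypothetical.

```python
import json
from pathlib import Path


def set_key_everywhere(node, key, value):
    """Set `key` to `value` wherever it appears in nested dicts/lists; return the hit count."""
    hits = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                hits += 1
            else:
                hits += set_key_everywhere(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            hits += set_key_everywhere(item, key, value)
    return hits


def prepare_controller_config(config_path, generator_dir, sampled_df_path):
    """Fill generator_dir and sampled_df_path in one controller config, in place."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    updates = (("generator_dir", generator_dir), ("sampled_df_path", sampled_df_path))
    for key, value in updates:
        if set_key_everywhere(config, key, str(value)) == 0:
            # Fail loudly instead of silently training with an unedited config.
            raise KeyError(f"{key!r} not found in {path}")
    path.write_text(json.dumps(config, indent=4))
    return config
```

Calling `prepare_controller_config` once per attribute config, followed by the train_controller.py command, covers both steps for each attribute.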
Your GAN will be saved in <"results_dir" in config>/<"save_name" in config>.
This phase was trained on 1 Nvidia V100 GPU with a batch size of 128.
For faster training, you can follow Rosinality: StyleGAN 2 in PyTorch and add custom CUDA kernels, similar to here, to line 18 in gan_model.py and set FUSED = True in line 15.
This work supports the following datasets:
The following instructions describe how to download and prepare the predictors used to run our code:
- ArcFace (ID): Download model_ir_se50.pth from InsightFace_Pytorch.
- Hopenet (Pose): Download hopenet_robust_alpha1.pkl from deep-head-pose.
- ESR (Expression): Download the directory named esr_9 from Efficient Facial Feature Learning and save it as is in src/gan_control/pretrained_models.
- R-Net (Illumination): Download the PyTorch R-Net model from here. This model is converted to PyTorch from the TensorFlow model published by Deep3DFaceReconstruction.
- PSPNet (Hair segmentation for hair color): Download pspnet_resnet101_sgd_lr_0.002_epoch_100_test_iou_0.918.pth from pytorch-hair-segmentation.
- DogFaceNet (Dog ID): Download the PyTorch DogFaceNet model from here. This model is converted to PyTorch from the TensorFlow model published by DogFaceNet.
- DEX (Age):
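Before starting a training run, it can help to check that the downloaded predictors are in place. The sketch below covers only the file and directory names that are spelled out in the list above; the R-Net, DogFaceNet and DEX file names are not given here, so they are left out rather than guessed. `missing_predictors` is a hypothetical helper, and it assumes all predictors are stored directly in src/gan_control/pretrained_models.

```python
from pathlib import Path

# Names taken from the list above; predictors without a stated file name are omitted.
EXPECTED_PREDICTORS = {
    "ArcFace (ID)": "model_ir_se50.pth",
    "Hopenet (Pose)": "hopenet_robust_alpha1.pkl",
    "ESR (Expression)": "esr_9",  # a directory, not a single file
    "PSPNet (Hair segmentation for hair color)":
        "pspnet_resnet101_sgd_lr_0.002_epoch_100_test_iou_0.918.pth",
}


def missing_predictors(models_dir="src/gan_control/pretrained_models"):
    """Return the names of the expected predictors that are not found in `models_dir`."""
    root = Path(models_dir)
    return [name for name, filename in EXPECTED_PREDICTORS.items()
            if not (root / filename).exists()]


missing = missing_predictors()
if missing:
    print("Missing predictors:", ", ".join(missing))
else:
    print("All listed predictors found.")
```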
Please consider citing our work if you find it useful for your research:
@InProceedings{Shoshan_2021_ICCV,
author = {Shoshan, Alon and Bhonker, Nadav and Kviatkovsky, Igor and Medioni, G\'erard},
title = {GAN-Control: Explicitly Controllable GANs},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
}
This code borrows heavily from Rosinality: StyleGAN 2 in PyTorch.
This code uses the following models:
- ArcFace (ID): InsightFace_Pytorch
- Hopenet (Pose): deep-head-pose
- ESR (Expression): Efficient Facial Feature Learning
- R-Net (Illumination): Deep3DFaceReconstruction
- DEX (Age): IMDB-WIKI
- PSPNet (Hair segmentation for hair color): pytorch-hair-segmentation
- DogFaceNet (Dog ID): DogFaceNet
This code uses face-alignment for face alignment.
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.