StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)

See you @ Siggraph 2021

Figure: Sequential edits using StyleFlow

High-quality, diverse, and photorealistic images can now be generated by unconditional GANs (e.g., StyleGAN). However, limited options exist to control the generation process using (semantic) attributes, while still preserving the quality of the output. Further, due to the entangled nature of the GAN latent space, performing edits along one attribute can easily result in unwanted changes along other attributes. In this paper, in the context of conditional exploration of entangled latent spaces, we investigate the two sub-problems of attribute-conditioned sampling and attribute-controlled editing. We present StyleFlow as a simple, effective, and robust solution to both the sub-problems by formulating conditional exploration as an instance of conditional continuous normalizing flows in the GAN latent space conditioned by attribute features. We evaluate our method using the face and the car latent space of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN generated images. For example, for faces, we vary camera pose, illumination variation, expression, facial hair, gender, and age. Finally, via extensive qualitative and quantitative comparisons, we demonstrate the superiority of StyleFlow to other concurrent works.

Rameen Abdal, Peihao Zhu, Niloy Mitra, Peter Wonka
KAUST, Adobe Research

[Paper] [Project Page] [Demo] [Promotional Video]

Installation

Clone this repo.

git clone https://github.com/RameenAbdal/StyleFlow.git
cd StyleFlow/

This code requires Python 3+, PyTorch, TensorFlow, torchdiffeq, and PyQt5. Install the dependencies with:

conda env create -f environment.yml

StyleGAN2 relies on custom TensorFlow ops that are compiled on the fly using NVCC. To set up the StyleGAN2 generator correctly, follow the Requirements section of the StyleGAN2 repo.
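After creating the environment, it can help to confirm that the listed dependencies are importable. The following is a minimal sketch, not part of this repo: the helper name `missing_modules` is hypothetical, and the module names are assumed to be the usual import names of the packages listed above (`torch`, `tensorflow`, `torchdiffeq`, `PyQt5`).

```python
import importlib.util

# Assumed import names for the dependencies listed in this README.
REQUIRED = ("torch", "tensorflow", "torchdiffeq", "PyQt5")


def missing_modules(names=REQUIRED):
    """Return the subset of `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

This only checks that the modules can be located; it does not verify versions or that the StyleGAN2 custom ops compile.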

Installation (Docker)

Clone this repo.

git clone https://github.com/RameenAbdal/StyleFlow.git
cd StyleFlow/

You must have CUDA (>=10.0 and <11.0) and nvidia-docker2 installed first!

Then run:

xhost +local:docker # Letting Docker access X server
wget -P stylegan/ http://d36zk2xti64re0.cloudfront.net/stylegan2/networks/stylegan2-ffhq-config-f.pkl
docker-compose up --build # Expect some time before UI appears

When finished, revoke Docker's access to the X server:

xhost -local:docker
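The CUDA version range required for the Docker setup can be checked before building. This is a rough sketch under the assumption that `nvcc` is on your PATH and that its `--version` output contains a `release X.Y` string; the function names are hypothetical and not part of this repo.

```python
import re
from typing import Optional, Tuple


def parse_nvcc_release(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from text such as 'release 10.1, V10.1.243'."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_supported(version: Tuple[int, int]) -> bool:
    """True if the version lies in the range stated above: >=10.0 and <11.0."""
    return (10, 0) <= version < (11, 0)
```

To use it, pass the text printed by `nvcc --version` to `parse_nvcc_release` and the resulting tuple to `cuda_supported`.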

UI Illustration

Loading an image may take 2-3 seconds on the first click. Move the slider smoothly to render results.

Editing Images Using Pretrained Models

Run the main UI
```
python main.py
```
Run the Attribute Transfer UI
```
python main_attribute.py
```
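Conceptually, an attribute edit with a conditional continuous normalizing flow can be pictured as two ODE solves: map the latent back to a base variable under the original attribute, then map it forward again under the new attribute. The toy below is only an illustration of that idea and is not the code used by the UIs: the dynamics `dz/dt = attr - z` are a hand-picked assumption standing in for the learned network, the fixed-step RK4 integrator stands in for an ODE solver such as torchdiffeq, and all function names are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def velocity(z: Sequence[float], t: float, attr: float) -> Vector:
    """Toy conditional dynamics dz/dt = attr - z (an assumption, not a trained model)."""
    return [attr - zi for zi in z]


def integrate(z: Sequence[float], attr: float, t0: float, t1: float,
              steps: int = 200) -> Vector:
    """Fixed-step RK4 from t0 to t1; passing t0 > t1 runs the flow in reverse."""
    h = (t1 - t0) / steps
    z = list(z)
    t = t0
    for _ in range(steps):
        k1 = velocity(z, t, attr)
        k2 = velocity([zi + 0.5 * h * k for zi, k in zip(z, k1)], t + 0.5 * h, attr)
        k3 = velocity([zi + 0.5 * h * k for zi, k in zip(z, k2)], t + 0.5 * h, attr)
        k4 = velocity([zi + h * k for zi, k in zip(z, k3)], t + h, attr)
        z = [zi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        t += h
    return z


def edit(w: Sequence[float], attr_old: float, attr_new: float) -> Vector:
    """Reverse the flow under the old attribute, then run it forward under the new one."""
    base = integrate(w, attr_old, 1.0, 0.0)    # latent -> base variable
    return integrate(base, attr_new, 0.0, 1.0)  # base variable -> edited latent
```

Because the flow is invertible, calling `edit` with identical old and new attribute values returns the input latent up to numerical error; changing the attribute moves the latent along the conditional dynamics.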

Web UI (Beta)

A web-based UI is now also available. See the webui dev branch for setup instructions.

Training New Model

The training dataset contains sampled StyleGAN2 latents, spherical harmonics (SH) lighting parameters, and other attributes. (Download Here)

Create ./data_numpy/ in the main folder and extract the above data into it, or create your own dataset.

Train your model:

   python train_flow.py
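Before training, the data folder can be created and inspected with a few lines of Python. This is a small sketch, not part of the repo: the helper names are hypothetical, and it assumes (from the folder name only) that the dataset files are NumPy `.npy` or `.npz` archives. It makes no assumption about the individual file names.

```python
from pathlib import Path
from typing import List


def prepare_data_dir(root: str = ".") -> Path:
    """Create <root>/data_numpy if it does not exist and return its path."""
    data_dir = Path(root) / "data_numpy"
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


def list_arrays(data_dir: Path) -> List[str]:
    """Return the sorted names of .npy/.npz files found directly in data_dir."""
    return sorted(p.name for p in Path(data_dir).iterdir()
                  if p.suffix in {".npy", ".npz"})
```

An empty result from `list_arrays` suggests the downloaded data has not been extracted into the folder yet.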

Projection

Our new projection method is currently under review and this section will be updated. Follow this repo for updates: https://github.com/ZPdesu/II2S

License

Citation

If you use this research/codebase/dataset, please cite our papers.

@article{10.1145/3447648,
author = {Abdal, Rameen and Zhu, Peihao and Mitra, Niloy J. and Wonka, Peter},
title = {StyleFlow: Attribute-Conditioned Exploration of StyleGAN-Generated Images Using Conditional Continuous Normalizing Flows},
year = {2021},
issue_date = {May 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {40},
number = {3},
issn = {0730-0301},
url = {https://doi.org/10.1145/3447648},
doi = {10.1145/3447648},
abstract = {High-quality, diverse, and photorealistic images can now be generated by unconditional GANs (e.g., StyleGAN). However, limited options exist to control the generation process using (semantic) attributes while still preserving the quality of the output. Further, due to the entangled nature of the GAN latent space, performing edits along one attribute can easily result in unwanted changes along other attributes. In this article, in the context of conditional exploration of entangled latent spaces, we investigate the two sub-problems of attribute-conditioned sampling and attribute-controlled editing. We present StyleFlow as a simple, effective, and robust solution to both the sub-problems by formulating conditional exploration as an instance of conditional continuous normalizing flows in the GAN latent space conditioned by attribute features. We evaluate our method using the face and the car latent space of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN generated images. For example, for faces, we vary camera pose, illumination variation, expression, facial hair, gender, and age. Finally, via extensive qualitative and quantitative comparisons, we demonstrate the superiority of StyleFlow over prior and several concurrent works. Project Page and Video: https://rameenabdal.github.io/StyleFlow.},
journal = {ACM Trans. Graph.},
month = may,
articleno = {21},
numpages = {21},
keywords = {image editing, Generative adversarial networks}
}

@INPROCEEDINGS{9008515,
  author={Abdal, Rameen and Qin, Yipeng and Wonka, Peter},
  booktitle={2019 IEEE/CVF International Conference on Computer Vision (ICCV)}, 
  title={Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?}, 
  year={2019},
  volume={},
  number={},
  pages={4431-4440},
  doi={10.1109/ICCV.2019.00453}}

Broader Impact

Important: Deep-learning-based facial imagery such as DeepFakes and GAN-generated images can be gravely misused. This can spread misinformation and lead to other offences. The intent of our work is not to promote such practices; instead, it is meant to be used in areas such as identification (novel views of a subject, occlusion inpainting, etc.), security (facial composites, etc.), image compression (high-quality video conferencing at lower bitrates, etc.), and the development of algorithms for detecting DeepFakes.

Acknowledgments

This implementation builds upon the awesome work done by Karras et al. (StyleGAN2), Chen et al. (torchdiffeq) and Yang et al. (PointFlow). This work was supported by Adobe Research and KAUST Office of Sponsored Research (OSR).
