CF-CLIP (Towards Counterfactual Image Manipulation via CLIP)

This repository is an official PyTorch implementation of the ACM MM 2022 paper "Towards Counterfactual Image Manipulation via CLIP".

Setup

The code relies on the official implementation of CLIP, and the Rosinality pytorch implementation of StyleGAN2.

Requirements

For all the methods described in the paper, is it required to have:

Anaconda
CLIP

Specific requirements for each method are described in its section. To install CLIP please run the following commands:

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git

Pretrained Models

Please download the following pertrained models and place them in ./pretrained folder.

StyleGAN

For AFHQ Dog and Cat, we can convert the tensorflow version pretrained model to pytorch version using convert_weight.py.

Face Recognition & VGG

Latent Codes

Inverted CelebA-HQ via e4e:

Random Sample

For cat and dog, we randomly sample w code (1*512) using GetCode.py, which uses the tensorflow version pretrained model. ('.pkl'). In this case, we need to set w_space option of training script to True.

Usage

Pretrained Models

We provided pretrained models for different face, AFHQ Dog and Cat cases in our paper here. You may put them under folder pretrained after downloading.

Training

The main training script is placed in mapper/scripts/train.py.
Training arguments can be found at mapper/options/train_options.py.
Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs. Additionally, if you have tensorboard installed, you can visualize tensorboard logs in opts.exp_dir/logs. Note that
To resume a training, please provide --checkpoint_path.
--description is where you provide the driving text.

Example for training a mapper for the green lipstick:

cd mapper
python scripts/train.py --exp_dir ../results/green_lipstick --description "green lipstick"

You may refer train.sh for the example of training AFHQ Dog/Cat cases.

Inference

The main inferece script is placed in mapper/scripts/inference.py.
Inference arguments can be found at mapper/options/test_options.py.
Adding the flag --couple_outputs will save image containing the input and output images side-by-side.

You may refer test.sh for reference.

Citation

If you find CF-CLIP useful or inspiring, please consider citing:

@inproceedings{yu2022-CFCLIP,
  title       = {Towards Counterfactual Image Manipulation via CLIP},
  author      = {Yu, Yingchen and Zhan, Fangneng and Wu, Rongliang and Zhang, Jiahui and Lu, Shijian and Cui, Miaomiao and Xie, Xuansong and Hua, Xian-Sheng and Miao, Chunyan},
  booktitle   = {Proceedings of the 30th ACM International Conference on Multimedia},
  year        = {2022}
}

Acknowledgments

This code borrows heavily from StyleCLIP, StyleGAN-NADA and InfoNCE, we apprecite the authors for sharing their codes.

yingchen001/CF-CLIP

CF-CLIP (Towards Counterfactual Image Manipulation via CLIP)

Setup

Requirements

Pretrained Models

StyleGAN

Face Recognition & VGG

Latent Codes

Inverted CelebA-HQ via e4e:

Random Sample

Usage

Pretrained Models

Training

Inference

Citation

Acknowledgments