Samples of Male to Female (CelebA-HQ), Wildlife to Cat (AFHQ), and Cat to Dog (AFHQ) translations obtained with UVCGANv2.
This package provides a reference implementation of the UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation paper.
uvcgan2 builds upon the CycleGAN method for unpaired image-to-image translation and improves its performance by modifying the generator, the discriminator, and the training procedure.
This README provides brief instructions on how to set up the uvcgan2 package and reproduce the paper results. To further facilitate reproducibility, we also share the pre-trained models (cf. the Pre-trained models section).
The code of uvcgan2 is based on pytorch-CycleGAN-and-pix2pix and uvcgan. Please refer to the LICENSE section for the proper copyright attribution.
UPDATE (2023-09-22): Changed the arxiv preprint title:
- from: "UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation"
- to: "Rethinking CycleGAN: Improving Quality of GANs for Unpaired Image-to-Image Translation"
This README mainly describes how to reproduce the Rethinking CycleGAN paper results. If you would like to apply uvcgan2 to some other dataset, please check out our accompanying repository uvcgan4slats, which describes an application of uvcgan to a generic scientific dataset.
In short, the procedure to adapt uvcgan2 to your problem is as follows:
- Arrange your dataset in a format similar to CelebA-HQ and AFHQ. For reference, the layout of the CelebA-HQ directory is:

      CelebA-HQ/          # Name of the dataset
          train/
              male/       # Name of the first domain
              female/     # Name of the second domain
          val/
              male/
              female/

  where the directories named male/ and female/ store the corresponding images. Arrange your dataset into a similar form, but choose appropriate names for the dataset directory and data domains.
- Next, take an existing training script as a starting point. For instance, scripts/celeba_hq/train_m2f_translation.py should work. The script contains a training configuration in the args_dict dictionary, whose format should be rather self-explanatory. Modify the following parameters of args_dict (see the sketch after this list):
  - Modify the data configuration to match your dataset.
  - Modify the outdir parameter and set it to the path where you want the output to be saved.
  - Modify the transfer parameter and set it to None. Alternatively, check our uvcgan4slats repository if you want to pre-train the generators on a pretext task.
- Use the instructions below to perform the model evaluation.
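As mentioned above, a heavily abridged sketch of the args_dict parts that typically need editing is given here. The nested structure of the data entry and the concrete values are illustrative assumptions only; copy the real structure from the existing training script and edit it in place.

    # Hypothetical fragment of a training script adapted to a custom dataset.
    # Only the keys mentioned above are shown; keep all other options from the
    # original scripts/celeba_hq/train_m2f_translation.py configuration.
    args_dict = {
        # ... other options copied from the original script ...
        'data'     : {
            # Point the dataset configuration at your own dataset directory and
            # domain names, mirroring the CelebA-HQ layout shown above.
        },
        'outdir'   : 'outdir/my_dataset/my_translation',  # where outputs will be saved
        'transfer' : None,  # no generator pre-training; see uvcgan4slats otherwise
    }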
uvcgan2 models were trained under the official pytorch container pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime. A similar training environment can be constructed with conda:
conda env create -f contrib/conda_env.yaml
The created conda environment can be activated with
conda activate uvcgan2
To install the uvcgan2 package, one can simply run the following command from the uvcgan2 source tree:
python3 setup.py develop --user
By default, uvcgan2 will try to read datasets from the ./data directory and will save trained models under the ./outdir directory. If you would like to change this default behavior, set the two environment variables UVCGAN2_DATA and UVCGAN2_OUTDIR to the desired paths. For instance, on a UNIX-like system (Linux, macOS) these variables can be set with:
export UVCGAN2_DATA=PATH_WHERE_DATA_IS_SAVED
export UVCGAN2_OUTDIR=PATH_TO_SAVE_MODELS_TO
To reproduce the results of the paper, the following workflow is suggested:
- Download the datasets (selfie2anime, celeba, celeba_hq, afhq).
- Pre-process the high-quality datasets.
- Pre-train generators on an Inpainting pretext task.
- Train CycleGAN models.
- Generate translated images and evaluate KID/FID scores.
We provide the pre-trained generators that were used to obtain the Rethinking CycleGAN paper results. They can be found on Zenodo.
uvcgan2 supplies a script ./scripts/download_model.sh to download the pre-trained models, e.g.
./scripts/download_model.sh afhq_cat2dog
The downloaded models will be unpacked under the ${UVCGAN2_OUTDIR} directory (./outdir by default).
uvcgan2 provides a script (scripts/download_dataset.sh) to download and unpack various CycleGAN datasets.
NOTE: As of June 2023, the CelebA datasets (male2female and glasses) need to be recreated manually. Please refer to celeba4cyclegan for instructions on how to do that.
For example, one can use the following commands to download the selfie2anime, CelebA male2female, CelebA eyeglasses, CelebA-HQ, and AFHQ datasets:
./scripts/download_dataset.sh selfie2anime
./scripts/download_dataset.sh male2female
./scripts/download_dataset.sh glasses
./scripts/download_dataset.sh celeba_all # Low-resolution CelebA
./scripts/download_dataset.sh celeba_hq
./scripts/download_dataset.sh afhq
The downloaded datasets will be unpacked under the UVCGAN2_DATA directory (or ./data if UVCGAN2_DATA is unset).
The images of the high-quality datasets CelebA-HQ and AFHQ have sizes of 1024x1024 and 512x512 pixels, respectively. For training and evaluation, however, we have relied on images of size 256x256. The script scripts/downsize_right.py can be used to properly resize the images:
python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/afhq/" "${UVCGAN2_DATA:-./data}/afhq_resized_lanczos"
python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/celeba_hq/" "${UVCGAN2_DATA:-./data}/celeba_hq_resized_lanczos"
Once the datasets are ready, the next step is to pre-train the generators on the Inpainting pretext task. uvcgan2 provides pre-training scripts for all the datasets:
scripts/afhq/pretrain_afhq.py
scripts/anime2selfie/pretrain_anime2selfie.py
scripts/celeba/pretrain_celeba.py
scripts/celeba_hq/pretrain_celebahq.py
These scripts can be run directly, e.g.
python3 scripts/afhq/pretrain_afhq.py
Optionally, they accept some command line arguments. For instance, the batch size can be adjusted by:
python3 scripts/afhq/pretrain_afhq.py --batch-size 8
More details can be found by looking over the scripts. Each of them contains a training configuration, which should be self-explanatory.
When the training is finished, the pre-trained generators will be saved under the ${UVCGAN2_OUTDIR} directory.
For each of the translation directions, we provide a corresponding image translation training script:
scripts/afhq/train_cat2dog_translation.py
scripts/afhq/train_wild2cat_translation.py
scripts/afhq/train_wild2dog_translation.py
scripts/anime2selfie/train_anime2selfie_translation.py
scripts/celeba/train_celeba_glasses_translation.py
scripts/celeba/train_celeba_male2female_translation.py
scripts/celeba_hq/train_m2f_translation.py
Similar to the pre-training scripts, they can simply be run as
python3 scripts/afhq/train_cat2dog_translation.py
The trained models will be saved under the "${UVCGAN2_OUTDIR}" directory.
uvcgan2 provides a script scripts/translate_images.py to perform a batch translation of images with one of the trained models. The script can be run as
python3 scripts/translate_images.py PATH_TO_TRAINED_MODEL --split SPLIT
where SPLIT is the data split (train, val, or test) to translate.
Due to how the datasets are constructed, one should use the test split for the anime2selfie and CelebA datasets, and the val split for the CelebA-HQ and AFHQ datasets.
The translated images will be saved under PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT.
The Rethinking CycleGAN paper describes two ways to evaluate the quality of translation:
- Consistent protocol, uniform across all datasets.
- Ad-hoc protocols for CelebA-HQ and AFHQ.
The consistent evaluation protocol relies on torch_fidelity (commit 5f7c5b5ccc4128bd79be2fdd8e75f118aa8fdc7c) to calculate KID/FID metrics of the translated images.
A helper script scripts/eval_fid.py is provided to facilitate this calculation. It can be run with
python3 scripts/eval_fid.py PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT --kid-size KID_SIZE
where KID_SIZE is a parameter of the KID calculation algorithm. Its value depends on the dataset and should be set to match the Rethinking CycleGAN paper (cf. Section 5.2 and Appendix E).
At the end of the calculation, the scores will be saved in the following file:
PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT/fid_metrics.csv
Please refer to our Benchmarking repository for additional details on how the consistent evaluation protocol was applied to the earlier GAN-based models.
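For reference, FID/KID values can also be computed by calling torch_fidelity directly from Python. The sketch below is an illustration only: the directory paths are placeholders, the kid_subset_size value must be set to match the paper, and keyword arguments may differ between torch_fidelity versions.

    # Hypothetical direct use of torch_fidelity to compute FID/KID between
    # translated images and target-domain reference images.
    from torch_fidelity import calculate_metrics

    metrics = calculate_metrics(
        input1 = 'PATH_TO_TRAINED_MODEL/evals/final/images_eval-val',  # translated images
        input2 = 'data/afhq/val/dog',     # placeholder reference directory
        cuda   = True,
        fid    = True,
        kid    = True,
        kid_subset_size = 100,            # placeholder; corresponds to KID_SIZE above
    )
    print(metrics)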
An alternative way to evaluate uvcgan2 models is to rely on various ad-hoc protocols found in the wild. In the paper, we have used two such protocols, for the CelebA-HQ and AFHQ datasets. For consistency with previous works, we have used EGSDE's implementation of these protocols.
The EGSDE evaluation code can be invoked by running its run_score.py script. The script needs to be manually modified for each translation direction, but the modifications are straightforward. An important variable of the run_score.py script is translate_path, which should be set to point to the location of the translated images.
Note, however, that uvcgan2 renames the translated images from their original, semi-random, names to sample_1.png, sample_2.png, etc. The indices correspond to the lexicographically sorted original names. Before providing the translated images to the run_score.py script, they should be renamed back to their original names.
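As an illustration of the renaming step, a small Python sketch is given below. It is not part of uvcgan2: the directory paths are placeholders, and it assumes that the translated images of the direction being evaluated sit directly in the chosen directory.

    # Hypothetical helper to undo uvcgan2's renaming of translated images.
    # sample_N.png is assumed to correspond to the N-th entry of the
    # lexicographically sorted list of original source-domain file names.
    import os
    import shutil

    orig_dir  = 'data/afhq/val/cat'      # placeholder: source-domain images
    trans_dir = 'translated_images'      # placeholder: directory with sample_N.png files
    out_dir   = 'translated_images_renamed'

    os.makedirs(out_dir, exist_ok=True)

    for idx, name in enumerate(sorted(os.listdir(orig_dir)), start=1):
        src = os.path.join(trans_dir, f'sample_{idx}.png')
        # Keep the original base name; the translated images are PNG files.
        dst = os.path.join(out_dir, os.path.splitext(name)[0] + '.png')
        shutil.copy(src, dst)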
Finally, uvcgan2 provides a script scripts/eval_il2_scores.py to batch-evaluate faithfulness scores based on Inception-v3 L2 distances. Its invocation is similar to that of the scripts/eval_fid.py script described above.
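Presumably, the invocation looks along the lines of the following; the exact options are not spelled out here and should be checked in the script itself:
python3 scripts/eval_il2_scores.py PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT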
Examples of translations:
- Selfie2Anime and Anime2Selfie (pdf)
- Gender Swap on the CelebA dataset (pdf)
- Removing and Adding Glasses on the CelebA dataset (pdf)
- Cat2Dog on the AFHQ dataset (pdf)
- Wild2Dog on the AFHQ dataset (pdf)
- Wild2Cat on the AFHQ dataset (pdf)
- Male2Female on the CelebA-HQ dataset (pdf)
You can specify which GPUs pytorch will use with the help of the CUDA_VISIBLE_DEVICES environment variable. This variable can be set to a list of comma-separated GPU indices. When it is set, pytorch will only use the GPUs whose IDs are listed in CUDA_VISIBLE_DEVICES.
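For example, to run a training script on the first GPU only:
CUDA_VISIBLE_DEVICES=0 python3 scripts/afhq/train_cat2dog_translation.py
or, to expose the first two GPUs to all subsequent commands in a shell session:
export CUDA_VISIBLE_DEVICES=0,1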
uvcgan2 is distributed under the BSD-2 license.
The uvcgan2 repository contains some code (primarily in the uvcgan2/base subdirectory) from pytorch-CycleGAN-and-pix2pix. This code is also licensed under the BSD-2 license (please refer to uvcgan2/base/LICENSE for details). Each code snippet that was taken from pytorch-CycleGAN-and-pix2pix has a note about the proper copyright attribution.