
Software that generates photos from paintings, turns horses into zebras, performs style transfer, and more (from UC Berkeley)


CycleGAN

Torch implementation for learning image-to-image translation (i.e. pix2pix) without input-output pairs, for example:

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Jun-Yan Zhu*, Taesung Park*, Phillip Isola, Alexei A. Efros
Berkeley AI Research Lab, UC Berkeley
In arXiv, 2017. (* equal contributions)

This package includes CycleGAN, pix2pix, as well as other methods such as BiGAN/ALI and the S+U learning method from Apple's paper.
A PyTorch version is coming soon (by April 7th).
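
As a rough sketch of the idea in the paper: two generators G: X -> Y and F: Y -> X are trained with adversarial losses plus a cycle-consistency term that pulls F(G(x)) back to x and G(F(y)) back to y (see the paper for the exact formulation and the weight λ):

L_cyc(G, F) = E_x[ ||F(G(x)) - x||_1 ] + E_y[ ||G(F(y)) - y||_1 ]
L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ L_cyc(G, F)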

Applications

Monet Paintings to Photos

Collection Style Transfer

Object Transfiguration

Season Transfer

Photo Enhancement: iPhone photo to DSLR photo

Setup

Prerequisites

  • Linux or OSX
  • NVIDIA GPU + CUDA + cuDNN (CPU mode and CUDA without cuDNN may work with minimal modification, but this is untested)

Getting Started

  • Install Torch (which provides luarocks), then install the required Torch packages:
luarocks install nngraph
luarocks install class
luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
  • Clone this repo:
git clone https://github.com/junyanz/CycleGAN
cd CycleGAN
  • Download the dataset (e.g. zebra and horse images from ImageNet):
bash ./datasets/download_dataset.sh horse2zebra
  • Train the model
DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model th train.lua
  • (CPU only) The same training command without a GPU or cuDNN. Setting the environment variables gpu=0 cudnn=0 forces CPU-only mode:
DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model gpu=0 cudnn=0 th train.lua
  • (Optionally) start the display server to view results as the model trains. (See Display UI for more details):
th -ldisplay.start 8000 0.0.0.0
  • Finally, test the model:
DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model phase=test th test.lua

The test results will be saved to an HTML file: ./results/horse2zebra_model/latest_test/index.html.

Train

DATA_ROOT=/path/to/data/ name=expt_name th train.lua

Models are saved to ./checkpoints/expt_name (can be changed by passing checkpoint_dir=your_dir in train.lua).

See opt_train in options.lua for additional training options.
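
For example, a minimal sketch combining the options mentioned above — a CPU-only run that writes checkpoints to a custom directory (the exact option names live in opt_train in options.lua; verify them there before relying on this):

DATA_ROOT=/path/to/data/ name=expt_name checkpoint_dir=./my_checkpoints gpu=0 cudnn=0 th train.lua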

Test

DATA_ROOT=/path/to/data/ name=expt_name which_direction='AtoB' phase=test th test.lua

This will run the model named expt_name in both directions on all images in /path/to/data/testA and /path/to/data/testB. If which_direction is 'BtoA', the roles of the A and B image sets are swapped.

Result images, and a webpage to view them, are saved to ./results/expt_name (can be changed by passing results_dir=your_dir in test.lua).

See opt_test in options.lua for additional testing options.
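
For instance, a sketch that tests only the B-to-A direction and writes the result webpage to a custom directory (option names as quoted above; confirm against opt_test in options.lua):

DATA_ROOT=/path/to/data/ name=expt_name which_direction='BtoA' phase=test results_dir=./my_results th test.lua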

Datasets

Download the datasets using the following script:

bash ./datasets/download_dataset.sh dataset_name
  • cityscapes: 2975 images from the Cityscapes training set.
  • maps: 1096 training images scraped from Google Maps.
  • horse2zebra: 939 horse images and 1177 zebra images downloaded from ImageNet using the keywords wild horse and zebra.
  • apple2orange: 996 apple images and 1020 orange images downloaded from ImageNet using the keywords apple and navel orange.
  • summer2winter_yosemite: 1273 summer Yosemite images and 854 winter Yosemite images downloaded using the Flickr API. See more details in our paper.
  • monet2photo, vangogh2photo, ukiyoe2photo, cezanne2photo: The art images were downloaded from Wikiart. The real photos were downloaded from Flickr using a combination of the tags landscape and landscapephotography. The training set size of each class is Monet:1074, Cezanne:584, Van Gogh:401, Ukiyo-e:1433, Photographs:6853.
  • iphone2dslr_flower: both classes of images were downloaded from Flickr. The training set size of each class is iPhone:1813, DSLR:3316. See more details in our paper.
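
To train on your own unpaired data, point DATA_ROOT at a directory laid out like the downloaded datasets. The layout below is inferred from the testA/testB paths used in the Test section and from the train/test phases, so treat it as an assumption and compare against a downloaded dataset:

/path/to/data/
  trainA/   training images from domain A (e.g. horses)
  trainB/   training images from domain B (e.g. zebras)
  testA/    test images from domain A
  testB/    test images from domain B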

Pre-trained Models

Download the pre-trained models with the following script. After the download has finished, rename the model file (e.g. the orange2apple model to ./checkpoints/orange2apple/latest_net_G.t7).

bash ./models/download_model.sh model_name
  • orange2apple (orange -> apple) and apple2orange (apple -> orange): trained on ImageNet images of apples and navel oranges.
  • horse2zebra (horse -> zebra) and zebra2horse (zebra -> horse): trained on ImageNet images of wild horses and zebras.
  • style_monet (landscape photo -> Monet painting style), style_vangogh (landscape photo -> Van Gogh painting style), style_ukiyoe (landscape photo -> Ukiyo-e painting style), style_cezanne (landscape photo -> Cezanne painting style): trained on paintings and Flickr landscape photos.
  • monet2photo (Monet paintings -> real landscape): trained on paintings and Flickr landscape photos.
  • cityscapes_photo2label (street scene -> label) and cityscapes_label2photo (label -> street scene): trained on the Cityscapes dataset.
  • map2sat (map -> aerial photo) and sat2map (aerial photo -> map): trained on Google maps.
  • iphone2dslr_flower (iPhone photos of flower -> DSLR photos of flower): trained on Flickr photos.

For example, to generate Ukiyo-e-style images using the pre-trained model:

bash ./datasets/download_dataset.sh ukiyoe2photo
bash ./models/download_model.sh style_ukiyoe
mkdir ./checkpoints/ukiyoe2photo_pretrained
mv ./models/style_ukiyoe.t7 ./checkpoints/ukiyoe2photo_pretrained/latest_net_G.t7
DATA_ROOT=./datasets/ukiyoe2photo name=ukiyoe2photo_pretrained which_direction='BtoA' model=one_direction_test phase=test th test.lua

Please pay attention to the direction: which_direction='BtoA' is used because the pretrained network transforms photos to Ukiyo-e-style images, while the ukiyoe2photo dataset goes from Ukiyo-e paintings (A) to photos (B). model=one_direction_test loads the code that generates outputs of the trained network in only one direction.
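
The same pattern should work for the other style models listed above by substituting the dataset and model names; for example, for style_cezanne (the file and directory names below mirror the Ukiyo-e example and are assumptions):

bash ./datasets/download_dataset.sh cezanne2photo
bash ./models/download_model.sh style_cezanne
mkdir ./checkpoints/cezanne2photo_pretrained
mv ./models/style_cezanne.t7 ./checkpoints/cezanne2photo_pretrained/latest_net_G.t7
DATA_ROOT=./datasets/cezanne2photo name=cezanne2photo_pretrained which_direction='BtoA' model=one_direction_test phase=test th test.lua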

Display UI

Optionally, to display images during training and testing, use the display package.

  • Install it with: luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
  • Then start the server with: th -ldisplay.start
  • Open this URL in your browser: http://localhost:8000

By default, the server listens on localhost. Pass 0.0.0.0 to allow external connections on any interface:

th -ldisplay.start 8000 0.0.0.0

Then open http://(hostname):(port)/ in your browser to load the remote desktop.

Citation

If you use this code for your research, please cite our paper:

@article{CycleGAN2017,
  title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  journal={arXiv preprint arXiv:1703.10593},
  year={2017}
}

Related Projects:

pix2pix: Image-to-image translation using conditional adversarial nets
iGAN: Interactive Image Generation via Generative Adversarial Networks

Cat Paper Collection

If you love cats, and love reading cool graphics, vision, and learning papers, please check out the Cat Paper Collection:
[Github] [Webpage]

Acknowledgments

The code borrows from pix2pix and DCGAN. The data loader is modified from DCGAN and Context-Encoder. The generative network is adapted from neural-style with Instance Normalization.