Cascaded Refinement Networks

This project synthesizes photorealistic images (deepfakes) from semantic label maps without using a generative adversarial network. It is an implementation of the convolutional neural network described in "Photographic Image Synthesis with Cascaded Refinement Networks" by Qifeng Chen and Vladlen Koltun. It differs from the authors' implementation in a few ways, listed under Differences. More information is available on the authors' website.

Required Python Libraries

TensorFlow
Keras
OpenCV
Pillow
NumPy
h5py
Python 3

Dataset

Please download the dataset from Cityscapes. We used the gtFine_trainvaltest (labels) and leftImg8bit_trainvaltest (images) packages.

Link: https://www.cityscapes-dataset.com/downloads/
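Training needs each photograph matched with its label map. The sketch below shows one way to derive the label file path from an image path, relying on the standard Cityscapes file naming. The function name, the root directory arguments, and the choice of the _gtFine_labelIds.png suffix are assumptions made for illustration; this repository may load a different label file (for example the _gtFine_color.png variant) or pair files in another way.

```python
from pathlib import Path

IMAGE_SUFFIX = "_leftImg8bit.png"


def label_path_for(image_path, images_root, labels_root,
                   label_suffix="_gtFine_labelIds.png"):
    """Map a Cityscapes image path to the matching gtFine label path.

    Hypothetical helper: label_suffix is an assumption, not necessarily
    the file type this repository reads.
    """
    # Keep the split/city sub-directories, e.g. train/aachen/<file>.
    relative = Path(image_path).relative_to(images_root)
    name = relative.name
    if not name.endswith(IMAGE_SUFFIX):
        raise ValueError(f"not a leftImg8bit image: {name}")
    # Swap the image suffix for the label suffix.
    label_name = name[:-len(IMAGE_SUFFIX)] + label_suffix
    return Path(labels_root) / relative.parent / label_name
```

For example, an image under leftImg8bit/train/aachen/ maps to a file with the same city and frame identifiers under gtFine/train/aachen/.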

Quick Start

Clone this repository.
Download the dataset from Cityscapes.
Prepare a save file for training by running the prepvgg subcommand and then the prepcrn subcommand.
Then train by using the train subcommand.
To synthesize images, use the generate subcommand after training.
Run python3 crn.py --help for more information.
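The order of the subcommands above can be scripted. The following is only a sketch: it uses the subcommand names from this README, but the helper names and the extra_args parameter are hypothetical, and each subcommand may require arguments that are not shown here. Check the --help output before running it.

```python
import subprocess
import sys

# Subcommand order as described in the Quick Start steps.
STEPS = ["prepvgg", "prepcrn", "train", "generate"]


def build_commands(script="crn.py", extra_args=None):
    """Return one command list per step.

    extra_args is a hypothetical dict mapping a step name to a list of
    additional command-line arguments; by default none are added.
    """
    extra_args = extra_args or {}
    return [
        [sys.executable, script, step, *extra_args.get(step, [])]
        for step in STEPS
    ]


def run_all(commands):
    """Run the commands in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_all(build_commands()) would run the four steps back to back. In practice you will probably want to inspect the results of training before running generate.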

Warning

Running this neural network requires a substantial amount of memory. Training the network in 256p requires at least 40 GB for a batch size of 1. Training in 1024p requires at least 120 GB for a batch size of 5.

Only 256p is enabled by default. To use the code for 512p or 1024p, uncomment the extra modules in the source.
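Higher resolutions need extra modules because of how the cascade is built. In the paper, the first refinement module works at 4x8 and each following module doubles the resolution. The sketch below lists the resolutions under that assumption. The function name is hypothetical, and the layout of modules in this repository's code may differ.

```python
def module_resolutions(target_height, base=(4, 8)):
    """List the (height, width) of each module in the cascade.

    Assumes the first module works at `base` and every later module
    doubles the resolution, as described in the paper.
    """
    height, width = base
    ratio, remainder = divmod(target_height, height)
    # The target must be the base height times a power of two.
    if remainder != 0 or ratio < 1 or ratio & (ratio - 1) != 0:
        raise ValueError("target_height must be base height times a power of two")
    resolutions = [(height, width)]
    while height < target_height:
        height, width = height * 2, width * 2
        resolutions.append((height, width))
    return resolutions
```

Under these assumptions, 256p uses 7 modules ending at 256x512, 512p adds one more, and 1024p adds two more, ending at 1024x2048.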

Differences

Uses batch normalization instead of layer normalization.
Uses an earlier version of their loss function.
Uses max pooling instead of bilinear subsampling.
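To make the first and third differences concrete, here is a minimal NumPy sketch. It is not the code used in this repository, and all function names are hypothetical. The normalization functions omit the learned scale and offset, and the layer normalization shown follows one common convention (normalizing over height, width, and channels per sample). The 2x2 average stands in for bilinear subsampling: for an exact factor of two with half-pixel-centered sampling, bilinear interpolation reduces to this average, but other resize conventions give different results.

```python
import numpy as np


def max_pool_2x2(x):
    """Downsample (H, W, C) by two, keeping the maximum of each 2x2 block."""
    h, w, c = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2, c)
    return blocks.max(axis=(1, 3))


def avg_pool_2x2(x):
    """Downsample (H, W, C) by two, averaging each 2x2 block.

    Used here as a stand-in for bilinear subsampling by a factor of two.
    """
    h, w, c = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2, c)
    return blocks.mean(axis=(1, 3))


def batch_norm(x, eps=1e-5):
    """Normalize (N, H, W, C) per channel, over batch and spatial axes."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def layer_norm(x, eps=1e-5):
    """Normalize (N, H, W, C) per sample, over spatial and channel axes."""
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)
```

One practical consequence: batch normalization statistics depend on the other samples in the batch, which matters when the batch size is 1, while layer normalization treats each sample independently. Max pooling keeps the strongest value in each block, whereas averaging blends neighboring values.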

Reference

Qifeng Chen and Vladlen Koltun. Photographic Image Synthesis with Cascaded Refinement Networks. In ICCV 2017.

johnathanlouie/cascaded-refinement-network