Chainer implementation of "Perceptual Losses for Real-Time Style Transfer and Super-Resolution"

Fast artistic style transfer using a feed-forward network.

input image size: 1024x768
processing time (CPU): 17.78 sec (Core i7-5930K)
processing time (GPU): 0.994 sec (Titan X)

Differences from original

The default --image_size is 512 (the original uses 256).
Dataset cropping can be switched off with the --fullsize option. By default, images are cropped, preserving aspect ratio.
Cropping uses ImageOps.fit, which always scales and crops. The original uses a custom solution that upscales the image if it is smaller than --image_size and otherwise just crops without scaling.
Dataset images and the input style image are scaled with bicubic and Lanczos resampling respectively, which gives sharper results when shrinking. The original uses nearest neighbour.
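The cropping difference described above can be illustrated with a little geometry. The sketch below is plain Python and does not call Pillow. The function names are hypothetical, the centered crop in the second function is an assumption about the original implementation, and the first function only approximates ImageOps.fit for a square target with default centering, up to rounding.

```python
def fit_crop_box(width, height, size):
    """Largest centered square crop, which is then resized to size x size.

    Approximates ImageOps.fit for a square target with default centering:
    the image is always cropped and always rescaled.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    scale = size / side  # applied after cropping, may shrink or enlarge
    return (left, top, left + side, top + side), scale


def upscale_then_crop_box(width, height, size):
    """Assumed reading of the original behaviour.

    Upscale only when the shorter side is below `size`, then take a
    centered size x size crop (centering is an assumption).
    """
    shorter = min(width, height)
    scale = size / shorter if shorter < size else 1.0
    scaled_w = max(size, round(width * scale))
    scaled_h = max(size, round(height * scale))
    left = (scaled_w - size) // 2
    top = (scaled_h - size) // 2
    return (left, top, left + size, top + size), scale


if __name__ == "__main__":
    # A 1024x768 image with --image_size 512
    print(fit_crop_box(1024, 768, 512))
    print(upscale_then_crop_box(1024, 768, 512))
```

For a large image, the first approach keeps most of the picture and shrinks it, while the second keeps only a 512x512 window at full resolution.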

Video Processing

The repo includes a bash script for transforming videos. It depends on ffmpeg; see the ffmpeg compilation instructions.

./genvid.sh input_video output_video model start_time duration

The first three arguments are mandatory and must be file paths.
The last two are optional and give the starting position and the duration in seconds.
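The argument rules above can be captured in a small helper that assembles the command line. This is only a sketch: the function name and the file names in the example are hypothetical, and it builds the argument list without running the script.

```python
def build_genvid_command(input_video, output_video, model,
                         start_time=None, duration=None):
    """Return the argument list for ./genvid.sh.

    The arguments are positional, so a duration can only be passed
    when a start time is passed before it.
    """
    if duration is not None and start_time is None:
        raise ValueError("duration requires start_time, since arguments are positional")
    command = ["./genvid.sh", input_video, output_video, model]
    if start_time is not None:
        command.append(str(start_time))
    if duration is not None:
        command.append(str(duration))
    return command


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run(..., check=True) to execute it.
    print(build_genvid_command("in.mp4", "out.mp4", "models/seurat.model", 10, 5))
```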

Requirement

Chainer

$ pip install chainer

Prerequisite

Download the VGG16 model and convert it into a smaller file that keeps only the convolutional layers, which make up about 10% of the entire model.

sh setup_model.sh
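The 10% figure can be checked with a quick parameter count. The sketch below assumes the standard VGG16 layout (thirteen 3x3 convolutions followed by three fully connected layers with a 1000-class output). It does not inspect the file that setup_model.sh produces.

```python
# (in_channels, out_channels) of the 13 convolutions in standard VGG16
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (in_features, out_features) of the three fully connected layers
FC_LAYERS = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]


def conv_params(layers, kernel=3):
    """Weights plus biases of square-kernel convolution layers."""
    return sum(cin * cout * kernel * kernel + cout for cin, cout in layers)


def fc_params(layers):
    """Weights plus biases of fully connected layers."""
    return sum(cin * cout + cout for cin, cout in layers)


conv_total = conv_params(CONV_LAYERS)
model_total = conv_total + fc_params(FC_LAYERS)
conv_share = conv_total / model_total

if __name__ == "__main__":
    print(conv_total, model_total, round(conv_share, 4))
```

Under these assumptions the convolutional layers hold roughly 10.6% of the parameters, which is consistent with the figure quoted above.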

Train

You need to train one image transformation network per style target. According to the paper, the models are trained on the Microsoft COCO dataset.

python train.py -s <style_image_path> -d <training_dataset_path> -g 0

Generate

python generate.py <input_image_path> -m <model_path> -o <output_image_path>

This repo includes pretrained models as examples.

example:

python generate.py sample_images/tubingen.jpg -m models/composition.model -o sample_images/output.jpg

python generate.py sample_images/tubingen.jpg -m models/seurat.model -o sample_images/output.jpg
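Both example commands write to the same output file, so the second run overwrites the first. A minimal sketch of one way to avoid that is shown below: it builds one generate.py command per model, each with its own output name. The helper name and the output naming scheme are hypothetical, and only the -m and -o flags shown above are used.

```python
from pathlib import PurePosixPath


def build_generate_commands(input_image, model_paths, output_dir):
    """Return one generate.py argument list per model, each with its own output file."""
    commands = []
    for model_path in model_paths:
        style = PurePosixPath(model_path).stem  # e.g. "seurat"
        output = str(PurePosixPath(output_dir) / f"output_{style}.jpg")
        commands.append(
            ["python", "generate.py", input_image, "-m", model_path, "-o", output]
        )
    return commands


if __name__ == "__main__":
    for command in build_generate_commands(
        "sample_images/tubingen.jpg",
        ["models/composition.model", "models/seurat.model"],
        "sample_images",
    ):
        # Pass each list to subprocess.run(command, check=True) to execute it.
        print(" ".join(command))
```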

Difference from paper

The convolution kernel size is 4 instead of 3.
Training with a batch size of 2 or more gives unstable results.
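The README does not say which layers use the size-4 kernel or why. As background only, the sketch below applies the usual output-size formulas for strided convolution and transposed convolution, assuming stride 2 and padding 1 (both values are assumptions, not taken from this repository). It shows how the kernel size affects whether upsampling exactly doubles the spatial size.

```python
def conv_out(size, kernel, stride, pad):
    """Output size of a strided convolution (floor mode)."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel, stride, pad):
    """Output size of a transposed convolution (deconvolution)."""
    return stride * (size - 1) + kernel - 2 * pad


if __name__ == "__main__":
    for kernel in (3, 4):
        down = conv_out(256, kernel, stride=2, pad=1)
        up = deconv_out(down, kernel, stride=2, pad=1)
        print(f"kernel {kernel}: 256 -> {down} -> {up}")
```

Under these assumed settings, a size-4 kernel maps 256 to 128 and back to 256, whereas a size-3 kernel maps 128 back to 255, so the output would no longer match the input size.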

No Backward Compatibility

Jul. 19, 2016

This version is not compatible with the previous versions. You can't use models trained by the previous implementation. Sorry for the inconvenience!

License

MIT

Reference

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

The code in this repository is based on the following works. Thanks to their authors.

chainer-gogh: Chainer implementation of neural-style. I referred to it heavily.
chainer-cifar10: the residual block implementation is based on it.

ttoinou/chainer-fast-neuralstyle