Fast Neural Style Transfer with Arbitrary Style using AdaIN Layer - Based on Huang et al. "Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization"

Arbitrary-Style-Transfer

Arbitrary-Style-Per-Model Fast Neural Style Transfer Method

Description

This project uses an Encoder-AdaIN-Decoder architecture: a deep convolutional neural network that serves as a Style Transfer Network (STN). The STN takes two arbitrary images as input (one for content, one for style) and outputs a generated image that combines the content and spatial structure of the former with the style (color, texture) of the latter, without re-training the network. The STN is trained on the MS-COCO dataset (about 12.6GB) and the WikiArt dataset (about 36GB).

This code is based on Huang et al., "Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization" (ICCV 2017).

Figure (stn_overview): System overview. The picture comes from the original paper by Huang et al. The encoder is a fixed VGG-19 (up to relu4_1) pre-trained on the ImageNet dataset for image classification. The decoder is trained to invert the AdaIN output from feature space back to image space.
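
To make the AdaIN step concrete, here is a minimal NumPy sketch of the operation as defined in the Huang et al. paper: AdaIN(x, y) = σ(y) · (x − μ(x)) / σ(x) + μ(y), with the statistics taken per channel over the spatial dimensions. It is not the code from this repo. The function name `adain`, the `eps` value and the (N, H, W, C) layout are assumptions made for illustration.

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive Instance Normalization on feature maps shaped (N, H, W, C).

    Illustrative sketch only; layout and eps are assumptions.
    """
    # Per-sample, per-channel statistics over the spatial axes (H, W).
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = np.sqrt(content_feat.var(axis=(1, 2), keepdims=True) + eps)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = np.sqrt(style_feat.var(axis=(1, 2), keepdims=True) + eps)
    # Normalize the content features, then rescale and shift them
    # with the style statistics.
    return s_std * (content_feat - c_mean) / c_std + s_mean

# Random arrays stand in for encoder (relu4_1) feature maps.
rng = np.random.RandomState(0)
content = rng.normal(0.0, 1.0, size=(1, 32, 32, 512))
style = rng.normal(3.0, 2.0, size=(1, 24, 24, 512))
target = adain(content, style)
print(target.shape)
```

The content and style feature maps may have different spatial sizes, because only the per-channel mean and standard deviation of the style features are used. The output keeps the spatial layout of the content features, and the decoder is trained to map it back to an image.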

Prerequisites

Trained Model

You can download my trained model from here; it was trained with a style weight of 2.0.
Alternatively, you can run download_trained_model.sh in the repo.
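
For context on the style weight: in the Huang et al. paper the decoder is trained with a total loss L = L_c + λ · L_s, where L_c compares the encoded output with the AdaIN target and L_s compares channel-wise means and standard deviations at several encoder layers. The sketch below shows where a style weight such as 2.0 enters. It is an assumption-laden illustration, not this repo's training code: the names `channel_stats` and `total_loss` are hypothetical, and mean squared differences are used as a common stand-in for the distances in the paper.

```python
import numpy as np

def channel_stats(feat, eps=1e-5):
    """Per-channel mean and std over spatial axes of an (N, H, W, C) array."""
    mean = feat.mean(axis=(1, 2))
    std = np.sqrt(feat.var(axis=(1, 2)) + eps)
    return mean, std

def total_loss(gen_feats, style_feats, adain_target, style_weight=2.0):
    """Content loss plus weighted style loss (hypothetical helper).

    gen_feats / style_feats: lists of encoder feature maps, one per layer,
    for the generated image and the style image; the last entry is assumed
    to be the deepest layer (relu4_1), which is compared with adain_target.
    """
    content_loss = np.mean((gen_feats[-1] - adain_target) ** 2)
    style_loss = 0.0
    for g, s in zip(gen_feats, style_feats):
        g_mean, g_std = channel_stats(g)
        s_mean, s_std = channel_stats(s)
        style_loss += np.mean((g_mean - s_mean) ** 2)
        style_loss += np.mean((g_std - s_std) ** 2)
    return content_loss + style_weight * style_loss
```

A larger style weight pushes the decoder toward matching the style statistics more closely, usually at the cost of content fidelity.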

Manual

  • The main file main.py is a demo that contains both the training procedure and the inferring procedure (inferring means generating stylized images).
    You can switch between the two procedures by changing the flag IS_TRAINING.
  • By default,
    (1) The content images lie in the folder "./images/content/"
    (2) The style images lie in the folder "./images/style/"
    (3) The weights file of the pre-trained VGG-19 lies in the current working directory. (See Prerequisites above. By the way, download_vgg19.sh already takes care of this.)
    (4) The MS-COCO images dataset for training lies in the folder "../MS_COCO/" (See Prerequisites above)
    (5) The WikiArt images dataset for training lies in the folder "../WikiArt/" (See Prerequisites above)
    (6) The checkpoint files of trained models lie in the folder "./models/" (You should create this folder manually before training.)
    (7) After the inferring procedure, the stylized images are written to the folder "./outputs/"
  • For training, you should make sure (3), (4), (5) and (6) are prepared correctly.
  • For inferring, you should make sure (1), (2), (3) and (6) are prepared correctly.
  • Of course, you can organize the files and folders however you want; you only need to modify the related parameters in main.py.
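
The checklist above can be sketched as a small pre-flight check that reports which of the default locations are missing before training or inferring. This is a hypothetical helper, not part of main.py: the names `DEFAULT_PATHS` and `missing_paths` are made up for illustration. The path of the VGG-19 weights file (item 3) is passed in by the caller, since its file name is not stated here.

```python
import os

# Default locations from the list above, keyed by item number.
# Item (3), the VGG-19 weights file, is supplied by the caller.
DEFAULT_PATHS = {
    1: "./images/content/",
    2: "./images/style/",
    4: "../MS_COCO/",
    5: "../WikiArt/",
    6: "./models/",
}

def missing_paths(is_training, vgg_weights_path, paths=DEFAULT_PATHS):
    """Return the required paths that do not exist yet (hypothetical helper)."""
    # Training needs (3), (4), (5), (6); inferring needs (1), (2), (3), (6).
    needed = (4, 5, 6) if is_training else (1, 2, 6)
    required = [vgg_weights_path] + [paths[i] for i in needed]
    return [p for p in required if not os.path.exists(p)]

# Replace the placeholder with the actual weights file location.
IS_TRAINING = True
print(missing_paths(IS_TRAINING, "path/to/vgg19_weights_file"))
```

An empty list means every required file and folder was found for the selected procedure.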

Results

style | output (generated image)

My Running Environment

Hardware

  • CPU: Intel® Core™ i9-7900X (3.30GHz x 10 cores, 20 threads)
  • GPU: NVIDIA® Titan Xp (Architecture: Pascal, Frame buffer: 12GB)
  • Memory: 32GB DDR4

Operating System

  • Ubuntu 16.04.3 LTS

Software

  • Python 3.6.2
  • NumPy 1.13.1
  • TensorFlow 1.3.0
  • SciPy 0.19.1
  • CUDA 8.0.61
  • cuDNN 6.0.21

References

  • The encoder, which is implemented with the first few layers (up to relu4_1) of a pre-trained VGG-19, is based on Anish Athalye's vgg.py

Citation

  @misc{ye2017arbitrarystyletransfer,
    author = {Wengao Ye},
    title = {Arbitrary Style Transfer},
    year = {2017},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/elleryqueenhomels/arbitrary_style_transfer}}
  }