This repository has an objective to implement Deep Feature Consisten Variational Autoencoder (DFC-VAE) according to Deep Feature Consistent Variational Autoencoder. Tensorflow and Python3 are used for development, and pre-trained VGG16 is adapted from VGG in TensorFlow. The training data is CelebA dataset.
To understand this following note, I would recommend to know the concept of Variational Autoencoder and generative model.
Figure 3: Interpolated image
It is known that one major problem of plain Variational Autoencoder (Plain-VAE) is that images generated by the model are blurry. This is because the plain model's loss function is defined by pixel-wise comparison between input images and generated images. As a consequence, optimizing model to achieve a great performance is difficult because slightly shifting or distorting those images can result in a very high loss. In other words, even the images have just slight difference in human eyes, computer treats that a big difference!
However, with DFC-VAE, the model leverages perceptual loss used in Neural Style Transfer. With regard to this paper, internal representations of convolutional neural networks could capture a content of the input image. This finding leads to the concept of perceptual loss, which compares the content - hidden representation - between images as oppose to calculate euclidean distant among pixels.
The solution contains four files
File Name | Description |
---|---|
dfc_vae_model.py | builds the VAE model, including encoder,decoder, VGG, loss function, and optimizer |
train_dfc_vae.py | trains the DFC_VAE model, and tests interpolation |
vgg16.py | builds the pre-trained VGG16 model |
util.py | contains supporting functions, such as data-preprocessing |
- Download pre-trained VGG weights from VGG in TensorFlow
- Download CelebA dataset from CelebA dataset
- Compress data in Zip
- Process images (crop and resize) and convert them to TFRecord format (refer to write_tfrecord() function in util.py)
- Run train_dfc_vae.py
- scipy.misc
- zipfile (used for reading content inside Zip file)
- imageio (used for generating .gif file)
- Beta value is extremely significant. You need to adjust the value to make sure the model produce a great result
- Save file in .png format for a better quality image