Generalized-PixelVAE

PixelVAE with or without regularization


Generalized VAE with PixelCNN Decoder

This repo implements the methods described in Towards a Deeper Understanding of Variational Autoencoding Models. A VAE with a powerful decoder family such as PixelCNN tends to ignore the latent code and use only the decoding distribution to represent the entire dataset. The paper shows that this phenomenon is not general: within a broader family of VAE models, there are members that prefer to use the latent code. In particular, without any regularization on the posterior, the model prefers to use the latent code. Furthermore, we can still obtain correct samples, albeit only through a Markov chain.
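The Markov chain sampling mentioned above alternates between the encoder and decoder: sample z from q(z|x), then a new x from p(x|z), and repeat. The sketch below illustrates only this alternating loop with hypothetical linear stand-ins for the trained encoder and decoder networks (the weights, dimensions, and noise scales are illustrative, not from the repo).

```python
import numpy as np

# Toy illustration of the alternating Markov chain sampler.
# W_enc / W_dec are hypothetical stand-ins for trained networks.
rng = np.random.default_rng(0)
latent_dim, data_dim = 4, 16
W_enc = rng.normal(scale=0.1, size=(data_dim, latent_dim))
W_dec = rng.normal(scale=0.1, size=(latent_dim, data_dim))

def encode(x):
    # q(z|x): Gaussian centered on a linear projection of x
    return x @ W_enc + 0.01 * rng.normal(size=latent_dim)

def decode(z):
    # p(x|z): Gaussian centered on a linear projection of z
    return z @ W_dec + 0.01 * rng.normal(size=data_dim)

def markov_chain_sample(steps=50):
    x = rng.normal(size=data_dim)  # start the chain from noise
    for _ in range(steps):
        z = encode(x)   # z_t ~ q(z | x_t)
        x = decode(z)   # x_{t+1} ~ p(x | z_t)
    return x

sample = markov_chain_sample()
```

In the actual model, `encode` and `decode` would be the trained inference and PixelCNN decoder networks, and decoding would involve sampling pixels autoregressively.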

  • Samples generated by model without regularization

[image: mc_noreg]

  • Samples generated by model with ELBO regularization

[image: mc_elbo]

Training with Default Options

Setup

Make sure you have the following installed

  • python 2 or 3 with numpy and scipy
  • tensorflow (tested on TensorFlow 0.12)

Train on CIFAR

To train on CIFAR with ELBO regularization

python train.py --use_autoencoder --save_dir=elbo --reg_type=elbo --gpus=0,1,2,3

To train on CIFAR without regularization

python train.py --use_autoencoder --save_dir=no_reg --reg_type=no_reg --gpus=0,1,2,3

Replace --gpus=[ids] with the IDs of the GPUs present on your system.

Additional Options

  • To use particular GPUs, add the option --gpus=[ids], e.g. --gpus=0,1 to use GPUs 0 and 1. Using multiple GPUs is recommended
  • To specify the batch size, use --batch_size=[size]
  • To specify the dimension of the latent code, use --latent_dim=[dim]
  • To specify the directory for all checkpoints, logs, and visualizations, use --save_dir=/path/to/folder. To visualize with TensorBoard, set this directory as the logdir.
  • To resume from a checkpoint file if one exists in the model directory, use --load_params
  • For more options and their meanings, refer to the original PixelCNN++
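For example, several of these options can be combined in one invocation (the batch size, latent dimension, and GPU IDs below are illustrative values, not defaults from the repo):

python train.py --use_autoencoder --reg_type=elbo --save_dir=elbo --gpus=0,1 --batch_size=64 --latent_dim=100 --load_params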