This is a PyTorch implementation of "Generating Diverse High-Fidelity Images with VQ-VAE-2", including the paper's PixelCNN and self-attention priors.
This is a work-in-progress. Here's my checklist:
- Implement Gated PixelCNN with conditioning
- Implement masked self-attention
- Test PixelCNN on MNIST
- Implement vector quantizing layer
- Implement VQ-VAE encoder/decoder
- Test VQ-VAE + PixelCNN on MNIST
- Implement hierarchical VQ-VAE
- Train hierarchical VQ-VAE on large images
- Train top PixelCNN on large images
- Train bottom PixelCNN on large images
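
Both the PixelCNN and the masked self-attention items above depend on a causal mask, so that each position only sees positions that come before it in raster order. The following is a minimal, framework-free sketch of that idea, not the code used in this repository. The function names `pixelcnn_mask` and `causal_attention_mask` are hypothetical. It shows the plain masked-convolution form from the original PixelCNN for a single channel; the gated variant splits the convolution into vertical and horizontal stacks, which is not shown here.

```python
def pixelcnn_mask(kernel_size, mask_type):
    """Build a 2D causal mask for a square convolution kernel (illustrative helper).

    mask_type "A" hides the center position (typically used in the first layer);
    mask_type "B" keeps the center visible (typically used in later layers).
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    center = kernel_size // 2
    mask = [[0] * kernel_size for _ in range(kernel_size)]
    for row in range(kernel_size):
        for col in range(kernel_size):
            above = row < center                         # rows above the center
            left_in_same_row = row == center and col < center
            is_center = row == center and col == center
            if above or left_in_same_row or (is_center and mask_type == "B"):
                mask[row][col] = 1
    return mask


def causal_attention_mask(length):
    """Lower-triangular mask: position i may attend to positions 0..i (illustrative helper)."""
    return [[1 if j <= i else 0 for j in range(length)] for i in range(length)]


if __name__ == "__main__":
    for row in pixelcnn_mask(3, "A"):
        print(row)
```

In a PyTorch model, a mask like this would typically be converted to a tensor and multiplied into the convolution weights (or used to block attention logits) before each forward pass.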