/vqgawn

Implementation of VQGAN from scratch of phase-1

Primary LanguagePython

Contributors Forks Stargazers Issues MIT License LinkedIn


Logo

VQGAWN

VQGAN implementation from scratch using pytorch!
Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Built With
  3. Getting Started
  4. Results
  5. Roadmap
  6. Credits

About The Project

VQGAN, short for Vector Quantized Generative Adversarial Network, is a powerful deep learning architecture used in the field of image generation. It combines elements of both generative adversarial networks (GANs) and vector quantization to create high-quality, diverse, and controllable images. VQGAN utilizes a discrete latent space to represent image features, allowing for efficient and expressive encoding of visual information.

Built With

Getting Started

Install all the libraries

pip install pytorch tqdm numpy albumentations matplotlib

Change the arguments value like dataset,batch size,etc in training_vqgan.py file and run that file

Results

0 Epochs

Logo

100 Epochs

Logo

Roadmap

  • Implementing 1st phase of the paper
  • Implementing transformer architecture for 2nd phase
  • Adding another vqgan model with prompt or masked images

See the open issues for a full list of proposed features (and known issues).

Credits

Citation

@misc{esser2021taming,
      title={Taming Transformers for High-Resolution Image Synthesis}, 
      author={Patrick Esser and Robin Rombach and Björn Ommer},
      year={2021},
      eprint={2012.09841},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}