Nano-transformer

Minimal encoder for text classification, decoder for text generation, and ViT for image classification


Minimal implementation of a transformer encoder for text classification, a transformer decoder for text generation, and a ViT for image classification (a diffusion transformer for image generation is in this repo).

The .py file contains code for:

  • Word, character, and BPE tokenizers, as well as vocabulary generation,
  • Text generation and text classification datasets,
  • Text and image embeddings,
  • Encoder, decoder, and ViT models, sharing modules wherever possible,
  • Training and evaluation, common for all three tasks.
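As an illustration of the tokenizer and vocabulary-generation side of the list above, here is a minimal character-level tokenizer sketch. The class name and interface are assumptions, not the repo's actual API:

```python
# Hypothetical minimal character-level tokenizer; class and method names
# are illustrative assumptions, not the repo's actual API.
class CharTokenizer:
    def __init__(self, texts, unk_token="<unk>"):
        # Build the vocabulary from every character seen in the corpus;
        # id 0 is reserved for the unknown token.
        chars = sorted(set("".join(texts)))
        self.vocab = {unk_token: 0}
        for ch in chars:
            self.vocab[ch] = len(self.vocab)
        self.inverse = {i: ch for ch, i in self.vocab.items()}

    def encode(self, text):
        # Map each character to its id; unseen characters map to <unk>.
        return [self.vocab.get(ch, 0) for ch in text]

    def decode(self, ids):
        return "".join(self.inverse[i] for i in ids)
```

A word-level tokenizer follows the same pattern with `text.split()` in place of character iteration; BPE additionally learns merge rules from pair frequencies.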

The .ipynb files minimally illustrate training and evaluation of the models using toy datasets (including MNIST for ViT) and small transformers. However, the code in the .py file should also allow training large models on serious datasets.
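The module shared by all three models is the attention core. A minimal sketch of scaled dot-product attention in NumPy (the repo itself presumably uses a deep-learning framework; the function name and signature are assumptions):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (seq_len, d_k) arrays. mask: optional (seq_len, seq_len)
    # boolean array where True marks positions to block, e.g. the causal
    # mask used by the decoder for text generation.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    if mask is not None:
        scores = np.where(mask, -1e9, scores)  # blocked positions get ~0 weight
    # Numerically stable softmax over the last axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

The encoder and ViT call this without a mask (every token or patch attends to every other), while the decoder passes `np.triu(np.ones((n, n), dtype=bool), k=1)` so each position attends only to earlier ones.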

References: