VisionTransformer: A Python repository from Blaise143

In this repository, I implement a Vision Transformer (ViT) inspired by the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale".

The model trains on the MNIST dataset but can be adapted to other datasets.
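
The core idea from the paper is to cut each image into fixed-size patches and treat every flattened patch as a token. The snippet below is a minimal, dependency-free sketch of that patching step on a 28x28 input, the size of an MNIST image. The function name `split_into_patches` and the patch size of 7 are assumptions chosen for illustration; they are not taken from PatchEmbedding.py, which may do this differently (for example, with a learned projection).

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Hypothetical helper for illustration only; not the repository's code.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    # Walk the image in row-major order, one patch-sized tile at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Dummy 28x28 "image" whose pixel value encodes its position.
image = [[row * 28 + col for col in range(28)] for row in range(28)]

# Assumed patch size of 7: a 4x4 grid of patches, each with 7*7 = 49 values.
patches = split_into_patches(image, patch_size=7)
print(len(patches), len(patches[0]))  # prints "16 49"
```

In a full model, each of these 49-value vectors would then be projected to the embedding dimension and combined with a positional embedding before entering the transformer blocks.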

Project Structure

├── src
│   ├── HelperFunctions.py
│   ├── PatchEmbedding.py
│   ├── Train.py
│   ├── TransformerBlock.py
│   └── VisionTransformer.py
├── main.py
└── README.md
