When I was growing up Transformers were cars that turned into robots.
Now they're the backbone of state-of-the-art machine learning and AI applications.
The goal of this repo is for me to learn and to provide simple resources for others on*:
- The attention mechanism and the original Transformer architecture.
- Various Transformer-based models (e.g. GPT).
- The `transformers` library by Hugging Face (it covers many different types of models, but why not?).
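As a taste of the first topic, the core operation of the original Transformer is scaled dot-product attention: Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal, framework-agnostic sketch in NumPy (function names are my own; swap in PyTorch tensors for the real thing):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)  # (seq_q, seq_k)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ v, weights

# Toy example: 3 queries attending over 4 key/value pairs of dimension 8.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # (3, 8) (3, 4)
```

The √d_k scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with tiny gradients (see the Transformer paper linked below).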
The first two will be more research-focused, whereas the third will be very practically applicable.
*Outline subject to change.
Assumes basic knowledge of PyTorch (or any other ML framework) and deep learning in general.
See learnpytorch.io for a beginner-friendly intro.
Or my Learn PyTorch in a day video on YouTube to get up to speed and then come back here.
Some of the resources I've found useful (this list will grow over time):
- Transformer paper: https://arxiv.org/abs/1706.03762
- The annotated transformer: http://nlp.seas.harvard.edu/2018/04/01/attention.html
- Transformers from scratch: https://peterbloem.nl/blog/transformers
- Attention functions in PyTorch code: https://github.com/sooftware/attentions/blob/master/attentions.py
- xFormers by Facebook Research, an in-depth implementation of many Transformer architecture components: https://github.com/facebookresearch/xformers
- Lilian Weng's post on attention (self-attention section): https://lilianweng.github.io/posts/2018-06-24-attention/#self-attention
- Jay Mody's post on attention intuition: https://jaykmody.com/blog/attention-intuition/
- Modifications to the original Transformer architecture (warning: there are lots): https://arxiv.org/abs/2102.11972
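To connect attention to the GPT-style models mentioned above: decoder-only Transformers apply a causal mask so that each position can only attend to itself and earlier positions. A minimal NumPy sketch of masked self-attention (names are my own, not from any library):

```python
import numpy as np

def causal_self_attention(x):
    """Scaled dot-product self-attention with a causal (lower-triangular) mask.

    For simplicity, queries, keys, and values are all `x` (no learned
    projections), which is enough to show the masking mechanics.
    """
    d_k = x.shape[-1]
    scores = x @ x.T / np.sqrt(d_k)  # (seq, seq)
    # Set scores for future positions to -inf so softmax assigns them zero.
    future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)  # stability
    e = np.exp(scores)  # exp(-inf) == 0.0, masking out future tokens
    weights = e / e.sum(axis=-1, keepdims=True)
    return weights @ x, weights

x = np.random.default_rng(1).normal(size=(5, 8))  # 5 tokens, dim 8
out, attn = causal_self_attention(x)
print(np.triu(attn, k=1).max())  # 0.0: no attention to future tokens
```

Real implementations add learned Q/K/V projections, multiple heads, and batching on top of this, but the mask is the whole trick that makes autoregressive generation work.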