The purpose of this repository is to act as an entry point to the world of transformer-based language modelling in PyTorch. It is pitched at those of us who want to understand the implementation details and grasp the theoretical insights, without having to wade through badly written research code 🙂
All code has been structured into a Python package called `modelling`, which is organised as follows:
```
└── src
    ├── modelling
    │   ├── data.py
    │   ├── rnn.py
    │   ├── transformer.py
    │   └── utils.py
```
We have done our best to make this as readable and as comprehensively documented as possible, so this is the place to go for the implementation details. To see it in action, use the following notebooks:
```
notebooks
├── 0_attention_and_transformers.ipynb
├── 1_datasets_and_dataloaders.ipynb
├── 2_text_generation_with_rnns.ipynb
├── 3_text_generation_with_transformers.ipynb
├── 4_pre_trained_transformers_for_search.ipynb
└── 5_pre_trained_transformers_for_sentiment_analysis.ipynb
```
These will guide you through the steps required to use the code contained within the `modelling` package to train a language model, and then use it to perform semantic search and sentiment classification tasks.
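To give a flavour of the end goal, the sketch below shows one common way to sample text from a trained autoregressive language model using greedy decoding. It is illustrative only - it assumes a model whose forward pass maps token IDs to next-token logits, which may differ from the actual interfaces used in the `modelling` package and notebooks.

```python
import torch


@torch.no_grad()
def generate(model, prompt_ids: torch.Tensor, max_new_tokens: int = 50) -> torch.Tensor:
    """Greedy autoregressive decoding from a trained language model.

    Assumes `model(ids)` returns logits of shape (batch, seq_len, vocab_size);
    the real interface in the `modelling` package may differ.
    """
    ids = prompt_ids
    for _ in range(max_new_tokens):
        logits = model(ids)                               # (batch, seq_len, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)           # append the most likely token
    return ids
```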
To run the notebooks and use the code within the `src/modelling` directory, either clone this repository and install the package directly from the source code,

```
pip install .
```
Or install it directly from this repository,

```
pip install git+https://github.com/AlexIoannides/transformers.git@main
```
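Either way, a quick sanity check is to import the package from a Python session, for example:

```python
# confirm that the modelling package installed correctly
import modelling

print(modelling.__file__)  # path to the installed package
```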
An HTML (and PDF) presentation of this work is contained in the `presentation_slides` directory.
We found the following useful in our ascent up the transformer and LLM learning curve:
- The Annotated Transformer - "Attention Is All You Need", the paper that introduced the transformer architecture for sequence-to-sequence modelling, annotated with PyTorch code snippets that demonstrate how to implement the concepts from first principles.
- Transformers and Multi-Head Attention - a comprehensive tutorial from Lightning AI that demonstrates how to compose and train a simple generative language model using the latest techniques for training transformer models.
- Language Modelling with `nn.Transformer` and torchtext - a tutorial from PyTorch that demonstrates how to use PyTorch's transformer layers to train a simple generative language model.
- Transformer Architecture: The Positional Encoding - a deep-dive into positional encoding and its role in transformer models (see the sketch below).
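For reference, here is a minimal sketch of the fixed sinusoidal positional encoding described in that last article and in the original paper. It is illustrative only and not necessarily how the encoding is implemented in `modelling/transformer.py`.

```python
import math

import torch


def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Fixed sinusoidal positional encodings from "Attention Is All You Need".

    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))

    Assumes an even d_model.
    """
    position = torch.arange(seq_len).unsqueeze(1)                # (seq_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
    )                                                            # (d_model / 2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe


# e.g. add to token embeddings of shape (batch, seq_len, d_model):
# x = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```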