
Transformers from scratch in PyTorch.

Transformers

This repository is dedicated to implementing and exploring Transformer models. It aims to provide an in-depth understanding of the architecture by building its components from scratch and applying them to a range of use cases.

Introduction

The Transformer model has revolutionized natural language processing and many fields beyond it. This project explores the model by breaking down and reconstructing its components, offering hands-on experience with its mechanics.

Repository Structure

  • /models - This directory contains the Transformer model implementations.

    • modules.py - Custom functions and classes essential for the Transformer model.
    • transformer_blocks.py - Defines the different types of Transformer blocks; a minimal sketch of one appears after this list.
  • /notebooks - Jupyter notebooks with detailed code and narrative explanations:

    • encoder_only_classification.ipynb - Illustrates how to apply the Transformer encoder for classification tasks.
    • decoder_only_gpt.ipynb - Explores language modeling with a decoder-only Transformer in the style of GPT.
    • transformer_scratchpad.ipynb - A notebook for trying out new ideas and code snippets.
  • Attention and Transformer Networks.pdf - A collection of notes on the Transformer architecture, compiled from various lectures and the original paper.
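
For orientation, here is a minimal sketch of the kind of encoder block that transformer_blocks.py defines. The class name, arguments, and pre-norm layout below are illustrative assumptions rather than the repository's actual API; a decoder-only variant, as used in decoder_only_gpt.ipynb, would additionally pass a causal attention mask so each position attends only to earlier ones.

```python
# Illustrative pre-norm Transformer encoder block (assumed structure; the classes
# in models/transformer_blocks.py may differ in naming and detail).
import torch
import torch.nn as nn


class EncoderBlock(nn.Module):
    def __init__(self, d_model, n_heads, d_ff, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # Self-attention sub-layer with a residual connection.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, key_padding_mask=key_padding_mask)
        x = x + self.dropout(attn_out)
        # Position-wise feed-forward sub-layer with a residual connection.
        x = x + self.dropout(self.ff(self.norm2(x)))
        return x


if __name__ == "__main__":
    block = EncoderBlock(d_model=64, n_heads=4, d_ff=256)
    tokens = torch.randn(2, 10, 64)   # (batch, sequence length, embedding size)
    print(block(tokens).shape)        # torch.Size([2, 10, 64])
```

Stacking several such blocks on top of token embeddings and positional encodings gives a full encoder.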

Use Cases Explored

In this section, we document the specific use cases that have been implemented:

  • Encoder-Only Classification: Demonstrated in encoder_only_classification.ipynb, which shows how an encoder can be adapted for classification; a brief sketch of the pattern follows this list.
  • Work in progress...
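
To make the encoder-only classification pattern concrete, the sketch below embeds token IDs, runs them through PyTorch's stock nn.TransformerEncoder, mean-pools over the sequence, and projects the result to class logits. The hyperparameters, pooling strategy, and omission of positional encodings are simplifications for illustration and may not match the notebook.

```python
# Illustrative encoder-only classifier (assumed setup; the notebook's model may differ).
import torch
import torch.nn as nn


class EncoderClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, num_classes)
        # Positional encodings are omitted here for brevity.

    def forward(self, token_ids):                 # token_ids: (batch, seq_len)
        h = self.encoder(self.embed(token_ids))   # (batch, seq_len, d_model)
        pooled = h.mean(dim=1)                    # average over the sequence
        return self.head(pooled)                  # (batch, num_classes)


if __name__ == "__main__":
    model = EncoderClassifier(vocab_size=1000, num_classes=2)
    logits = model(torch.randint(0, 1000, (8, 16)))
    print(logits.shape)  # torch.Size([8, 2])
```

Training such a model with nn.CrossEntropyLoss on the logits completes the classification setup.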

Acknowledgments

  • "Attention is All You Need" by Vaswani et al.

  • The educational content from Prof. Pascal Poupart's Machine Learning lectures available on YouTube.

Note: This README is updated regularly to capture the ongoing development and additions to the project.