Transformers


Author: Chelsea Zaloumis

Last update: 4/15/2021

A lecture-style exploration of transformers, following Jay Alammar's post "The Illustrated Transformer". It includes breakout questions and motivating examples.

Lecture objectives:

  1. Motivation for Transformers
  2. Define Transformers
  3. Define Self-Attention
    1. Self-Attention with vectors
    2. Self-Attention with matrices
  4. Define Multi-Head Attention
  5. Define the Encoder-Decoder Attention layer
  6. Final Linear & Softmax Layers
  7. Loss Function

References/Resources

Further Work

  1. Coding a basic transformer for natural language processing.
  2. Coding a not-so-basic transformer for a TBD application.