Last update: 4/15/2021
A lecture-style exploration of transformers following Jay Alammar's post *The Illustrated Transformer*. Includes breakout questions and motivating examples.
- Motivation for Transformers
- Define Transformers
- Define Self-Attention
- Self-Attention with vectors
- Self-Attention with matrices (see the sketch after this outline)
- Define Multi-Head Attention
- Define Encoder-Decoder Attention Layer
- Final Linear & Softmax Layers
- Loss Function
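
As a quick reference for the self-attention-with-matrices step above, here is a minimal NumPy sketch of scaled dot-product self-attention. The names (`self_attention`, `W_q`, `W_k`, `W_v`) and the toy dimensions are illustrative assumptions, not code from the lecture or from Alammar's post; multi-head attention simply runs several such projection sets in parallel and concatenates the outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of embeddings.

    X: (seq_len, d_model) input embeddings.
    W_q, W_k, W_v: (d_model, d_k) learned projection matrices.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) token-to-token scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted sum of value vectors

# Toy usage: 4 tokens, model dim 8, head dim 4 (arbitrary example sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```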
- Neural Machine Translation by Jointly Learning to Align and Translate
- Effective Approaches to Attention-based Neural Machine Translation
- Creating Word Embeddings: Coding the Word2Vec Algorithm in Python using Deep Learning
- GloVe: Global Vectors for Word Representation (for word embedding weights)
- Coding a basic transformer for natural language processing (a starter sketch of the output layers follows below).
- Coding a not-so-basic transformer for a TBD application.
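
As a starting point for the basic-transformer coding exercise, here is a hedged NumPy sketch of the final linear and softmax layers and a cross-entropy loss from the outline. All names and sizes (`W_vocab`, a 10-token vocabulary, and so on) are made-up examples, not the lecture's actual code.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilize before exponentiating
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def output_layer(decoder_out, W_vocab, b_vocab):
    """Project decoder outputs to vocabulary logits, then to probabilities."""
    logits = decoder_out @ W_vocab + b_vocab  # (seq_len, vocab_size)
    return softmax(logits, axis=-1)

def cross_entropy(probs, targets):
    """Mean negative log-likelihood of the target token at each position."""
    positions = np.arange(probs.shape[0])
    return -np.log(probs[positions, targets] + 1e-12).mean()

# Toy usage: 5 output positions, model dim 8, vocabulary of 10 tokens.
rng = np.random.default_rng(1)
decoder_out = rng.normal(size=(5, 8))
W_vocab = rng.normal(size=(8, 10))
b_vocab = np.zeros(10)
probs = output_layer(decoder_out, W_vocab, b_vocab)
targets = rng.integers(0, 10, size=5)
print(cross_entropy(probs, targets))  # loss for random weights and targets
```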