Let's build GPT: from scratch, in code, spelled out

We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections to ChatGPT, which has taken the world by storm. We watch GitHub Copilot, itself a GPT, help us write a GPT (meta :D!). I recommend people watch the earlier makemore videos to get comfortable with the autoregressive language modeling framework and the basics of tensors and PyTorch nn, which we take for granted in this video.
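The core mechanism the video builds up to is causal (masked) self-attention from "Attention is All You Need": each token produces a query, key, and value, and attends only to itself and earlier positions. As a rough sketch of that idea (a single head in NumPy rather than the video's PyTorch code; the function and weight names here are illustrative, not from the lecture):

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """One attention head with a causal mask, as in a decoder-only GPT.

    x: (T, C) token embeddings; Wq, Wk, Wv: (C, H) projection weights.
    Each position attends only to itself and earlier positions.
    """
    T = x.shape[0]
    q, k, v = x @ Wq, x @ Wk, x @ Wv            # queries, keys, values: (T, H)
    scores = q @ k.T / np.sqrt(k.shape[-1])     # scaled dot-product similarities
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf                      # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over past positions
    return weights @ v                          # weighted sum of values: (T, H)
```

Because of the mask, the first token can only attend to itself, so its output is exactly its own value vector — a quick sanity check that the autoregressive constraint holds.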

Prerequisites

  1. Math (Calculus, Linear Algebra, Matrices)
  2. ML Frameworks
  3. Basic ML knowledge

Some more important links

Metadata

| Attribute         | Value |
| ----------------- | ----- |
| Name              | ChatGPT from Scratch |
| Instructor        | Andrej Karpathy |
| Link              | https://www.youtube.com/watch?v=kCc8FmEb1nY&t=3519s |
| Tags              | AI, NL, NLP, Transformers, Theory |
| Difficulty        | Advanced |
| Course Created At | 2023-01-17 |
| Notes Created At  | 2023-03-05 |