This course will get you from no knowledge of deep learning to training a GPT model. We'll start with the basics, then build up to complex networks.
To use this course, go through each chapter from the beginning. Read the lessons, or watch the optional videos. Then look through the implementations to solidify your understanding. I also recommend implementing each algorithm on your own.
Get an overview of the course and what we'll learn. Includes some math and NumPy fundamentals you'll need for deep learning.
- Lesson: Read the intro
Gradient descent is how neural networks adjust their parameters to fit the data. It's the "learning" part of deep learning.
- Lesson: Read the gradient descent tutorial and watch the optional video
- Implementation: Notebook and class
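As a taste of the idea, here is a minimal gradient descent sketch: it fits a single slope with mean squared error. The data, learning rate, and iteration count are illustrative, not taken from the course.

```python
import numpy as np

# Minimal gradient descent sketch: fit y = w * x with mean squared error.
# The data and learning rate here are illustrative choices.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x      # true slope is 2

w = 0.0          # initial parameter
lr = 0.05        # learning rate
for _ in range(200):
    pred = w * x
    # d(MSE)/dw = mean(2 * (pred - y) * x)
    grad = np.mean(2 * (pred - y) * x)
    w -= lr * grad

print(round(w, 2))  # converges to the true slope, 2.0
```

Each update nudges `w` in the direction that lowers the loss; repeated many times, that's all "training" is.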
Dense networks are the basic form of a neural network, where every input is connected to every output. These can also be called fully connected networks.
- Lesson: Read the dense network tutorial and watch the optional video
- Implementation: Notebook and class
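A dense layer boils down to one matrix multiply plus a bias. Here's a small sketch; the shapes and the ReLU activation are illustrative choices.

```python
import numpy as np

# Sketch of a dense (fully connected) layer: every input feeds every output.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # batch of 4 examples, 3 features each
W = rng.normal(size=(3, 5))        # weight matrix: 3 inputs -> 5 outputs
b = np.zeros(5)                    # one bias per output unit

hidden = np.maximum(x @ W + b, 0)  # linear transform + ReLU activation
print(hidden.shape)  # (4, 5)
```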
In the last two lessons, we learned how to perform regression with neural networks. Now, we'll learn how to perform classification.
- Lesson: Read the classification tutorial
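The key pieces of classification are the softmax function, which turns raw scores into class probabilities, and cross-entropy loss. A sketch with made-up logits:

```python
import numpy as np

# Classification sketch: softmax turns raw scores (logits) into class
# probabilities; cross-entropy measures error against the true label.
logits = np.array([2.0, 1.0, 0.1])   # illustrative scores for 3 classes

exp = np.exp(logits - logits.max())  # subtract max for numerical stability
probs = exp / exp.sum()              # probabilities sum to 1

true_class = 0
loss = -np.log(probs[true_class])    # cross-entropy for one example
print(probs.round(3), round(loss, 3))
```

The loss is small when the probability assigned to the true class is near 1, and grows without bound as that probability approaches 0.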
Recurrent neural networks can process sequences of data. They're used for time series and natural language processing.
- Lesson: Read the recurrent network tutorial
- Implementation: Notebook
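The core of an RNN is a hidden state that carries information from earlier items in the sequence. A sketch of the recurrence, with illustrative sizes and a tanh activation:

```python
import numpy as np

# Sketch of a recurrent loop: the hidden state h summarizes the sequence so far.
rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 4)) * 0.1   # input -> hidden
W_h = rng.normal(size=(4, 4)) * 0.1   # hidden -> hidden (the recurrence)

sequence = rng.normal(size=(6, 3))    # 6 time steps, 3 features each
h = np.zeros(4)                       # initial hidden state
for x_t in sequence:
    h = np.tanh(x_t @ W_x + h @ W_h)  # same weights reused at every step

print(h.shape)  # (4,)
```

Note that the same weight matrices are applied at every time step; that sharing is what lets one network handle sequences of any length.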
Regularization prevents overfitting to the training set, which helps the network generalize to new data.
- Lesson: Read the regularization tutorial (coming soon)
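One common form is L2 regularization (weight decay): a penalty on large weights is added to the loss. A sketch of its effect on one gradient step, with made-up values:

```python
import numpy as np

# Sketch of L2 regularization: adding lam * sum(w**2) to the loss pushes
# weights toward zero. The penalty strength lam is an illustrative choice.
w = np.array([3.0, -2.0, 0.5])
lam = 0.01

data_grad = np.array([0.1, 0.2, -0.1])  # gradient from the data loss (made up)
reg_grad = 2 * lam * w                  # gradient of lam * sum(w**2)
total_grad = data_grad + reg_grad

w -= 0.1 * total_grad  # each step also shrinks the weights slightly
print(w.round(3))
```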
PyTorch is a framework for deep learning that automates the backward pass of neural networks. This makes it simpler to implement complex networks.
- Lesson: Read the PyTorch tutorial (coming soon)
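To give a feel for what PyTorch automates, here is a toy sketch of the idea behind autograd: record each operation during the forward pass, then apply the chain rule backward. This `Value` class is purely illustrative, not PyTorch's actual machinery.

```python
# Toy sketch of reverse-mode autodiff, the idea behind PyTorch's autograd.
class Value:
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.grad_fns = grad_fns  # local derivatives w.r.t. each parent

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def backward(self, grad=1.0):
        # Accumulate the incoming gradient, then pass it to the parents.
        self.grad += grad
        for parent, fn in zip(self.parents, self.grad_fns):
            parent.backward(fn(grad))

x = Value(3.0)
w = Value(2.0)
loss = x * w + x       # loss = w*x + x
loss.backward()
print(w.grad, x.grad)  # d(loss)/dw = x = 3.0, d(loss)/dx = w + 1 = 3.0
```

PyTorch builds this graph automatically for tensors with `requires_grad=True`, which is why you only ever write the forward pass.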
If you want to train a deep learning model, you need data. Gigabytes of it. We'll discuss how you can get this data and process it.
- Lesson: Read the data tutorial (coming soon)
- Implementation: Notebook coming soon
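A common first step is turning raw text into integer ids and splitting off a validation set. A character-level sketch, with a made-up sample text and an illustrative 90/10 split:

```python
# Sketch of basic text data preparation: build a character-level vocabulary,
# encode text as integers, and split into train/validation sets.
text = "the quick brown fox jumps over the lazy dog " * 100  # stand-in corpus

vocab = sorted(set(text))                     # unique characters
stoi = {ch: i for i, ch in enumerate(vocab)}  # char -> integer id
encoded = [stoi[ch] for ch in text]

split = int(len(encoded) * 0.9)               # hold out 10% for validation
train_ids, val_ids = encoded[:split], encoded[split:]
print(len(vocab), len(train_ids), len(val_ids))
```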
Encoder/decoders are used for NLP tasks when the output isn't the same length as the input. For example, if you want to use questions/answers as training data, the answers may be a different length than the questions.
- Lesson: Read the encoder/decoder tutorial (coming soon)
- Implementation: Notebook
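The shape of the idea: the encoder folds a variable-length input into one hidden state, and the decoder unrolls that state into an output of a different length. A sketch with illustrative sizes (real models add learned output layers and stop tokens):

```python
import numpy as np

# Encoder/decoder sketch: compress the input sequence to one hidden state,
# then generate an output whose length differs from the input's.
rng = np.random.default_rng(0)
W_enc = rng.normal(size=(3, 4)) * 0.1
W_h = rng.normal(size=(4, 4)) * 0.1
W_dec = rng.normal(size=(4, 3)) * 0.1

question = rng.normal(size=(5, 3))   # input sequence: 5 steps

# Encoder: fold the whole input into a single hidden state.
h = np.zeros(4)
for x_t in question:
    h = np.tanh(x_t @ W_enc + h @ W_h)

# Decoder: generate an output of a different length (here, 8 steps).
answer = []
for _ in range(8):
    h = np.tanh(h @ W_h)
    answer.append(h @ W_dec)

answer = np.array(answer)
print(answer.shape)  # (8, 3): longer than the 5-step input
```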
Transformers avoid the vanishing/exploding gradient problems of RNNs by using attention. Attention lets the network process the whole sequence at once, instead of step by step.
- Lesson: Read the transformer tutorial (coming soon)
- Implementation: Notebook
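The heart of a transformer is scaled dot-product attention: every position scores every other position, and the output is a weighted mix of values. A sketch with illustrative sizes and random inputs:

```python
import numpy as np

# Sketch of scaled dot-product attention, the core transformer operation.
rng = np.random.default_rng(0)
seq_len, d = 6, 8
Q = rng.normal(size=(seq_len, d))   # queries
K = rng.normal(size=(seq_len, d))   # keys
V = rng.normal(size=(seq_len, d))   # values

scores = Q @ K.T / np.sqrt(d)                    # similarity of each pair
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
output = weights @ V                             # weighted mix of values

print(output.shape)  # (6, 8): one vector per position, computed in parallel
```

Because no step depends on the previous one, the whole sequence is handled in a few matrix multiplies rather than a loop.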
GPT models take a long time to train. We can reduce that time by using more GPUs, but we don't all have access to GPU clusters. To reduce training time, we'll incorporate some recent advances to make the transformer model more efficient.
- Lesson: Read the efficient transformer tutorial (coming soon)
- Implementation: Notebook
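One widely used efficiency trick (not necessarily one the tutorial covers) is a key/value cache: during generation, keys and values for earlier tokens are stored and reused instead of recomputed at every step. A simplified sketch, with projection matrices omitted and sizes made up:

```python
import numpy as np

# Sketch of a key/value cache for step-by-step generation. The cache grows
# by one entry per token, so only the newest token's attention is computed.
rng = np.random.default_rng(0)
d = 4
k_cache, v_cache = [], []

for step in range(5):
    x = rng.normal(size=d)          # embedding for the newest token
    k_cache.append(x)               # (projection matrices omitted for brevity)
    v_cache.append(x)
    K = np.array(k_cache)
    V = np.array(v_cache)

    scores = x @ K.T / np.sqrt(d)   # attend only from the new token
    w = np.exp(scores - scores.max())
    w /= w.sum()
    out = w @ V                     # attention output for the new token

print(len(k_cache), out.shape)  # 5 cached entries, output shape (4,)
```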
Convolutional neural networks are used for working with images and time series.
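The core operation is a small filter sliding along the input, so the same pattern is detected wherever it occurs. A 1D sketch with made-up values:

```python
import numpy as np

# Sketch of a 1D convolution: slide a small kernel along the signal.
signal = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
kernel = np.array([1.0, -1.0])   # responds to steps in the signal

out = np.array([
    signal[i:i + len(kernel)] @ kernel
    for i in range(len(signal) - len(kernel) + 1)
])
print(out)  # nonzero exactly where the signal steps up or down
```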
Gated recurrent networks help RNNs process long sequences by letting the network forget irrelevant information. LSTM and GRU are two popular types of gated networks.
- Lesson: Read the GRU tutorial (coming soon)
- Implementation: Notebook
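The gating idea in one sketch: a gate between 0 and 1 decides how much of the old hidden state to keep versus overwrite. This is simplified; a real GRU computes its gates from learned weights, not the fixed values used here.

```python
import numpy as np

# Sketch of gating: blend old state and a candidate state with a 0-1 gate.
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

h_old = np.array([0.9, -0.5])          # previous hidden state
candidate = np.array([0.1, 0.8])       # proposed new state from the input

gate = sigmoid(np.array([2.0, -2.0]))  # near 1: keep old; near 0: replace
h_new = gate * h_old + (1 - gate) * candidate

print(h_new.round(3))  # first unit mostly kept, second mostly replaced
```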
If you want to run these notebooks locally, you'll need to install some Python packages.
- Make sure you have Python 3.8 or higher installed.
- Clone this repository.
- Run `pip install -r requirements.txt`.
You can use and adapt this material for your own courses, but not commercially. You also must provide attribution.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.