This repo contains an incremental sequence of notebooks designed to teach deep learning, MXNet, and the `gluon` interface. Our goal is to leverage the strengths of Jupyter notebooks to present prose, graphics, equations, and code together in one place. If we're successful, the result will be a resource that could be simultaneously a book, course material, a prop for live tutorials, and a resource for plagiarising (with our blessing) useful code. To our knowledge there's no source out there that either (1) teaches the full breadth of concepts in modern deep learning or (2) interleaves an engaging textbook with runnable code. We'll find out by the end of this venture whether or not that void exists for a good reason.
Another unique aspect of this book is its authorship process. We are developing this resource fully in the public view and are making it available for free in its entirety. While the book has a few primary authors to set the tone and shape the content, we welcome contributions from the community and hope to coauthor chapters and entire sections with experts and community members. Already we've received contributions ranging from typo corrections to full working examples.
Throughout this book, we rely upon MXNet to teach core concepts, advanced topics, and a full complement of applications. MXNet is widely used in production environments owing to its strong reputation for speed. Now with `gluon`, MXNet's new imperative interface (alpha), doing research in MXNet is easy.
To run these notebooks, you'll want to build MXNet from source. Fortunately, this is easy (especially on Linux) if you follow these instructions. You'll also want to install Jupyter and use Python 3 (because it's 2017).
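Before opening the notebooks, it can help to confirm that the interpreter and packages are in place. The following is a minimal sketch, not part of the official setup instructions: the helper name `check_environment` is hypothetical, and it assumes the packages are importable under the names `mxnet` and `notebook` (the latter being the usual module name for Jupyter Notebook).

```python
import importlib.util
import sys


def check_environment(required_major=3, package="mxnet"):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration only.
    """
    problems = []
    if sys.version_info.major < required_major:
        problems.append(
            "Python %d+ is required, found %d.%d"
            % (required_major, sys.version_info.major, sys.version_info.minor)
        )
    # find_spec only locates a top-level package; it does not import it.
    if importlib.util.find_spec(package) is None:
        problems.append("package %r is not importable" % package)
    return problems


if __name__ == "__main__":
    # Package names below are assumptions about how the tools were installed.
    for name in ("mxnet", "notebook"):
        issues = check_environment(package=name)
        print(name, "OK" if not issues else "; ".join(issues))
```

An empty result for both names suggests the notebooks should at least start; it does not verify that MXNet was built with the features a given chapter needs.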
- 0 - Preface
- 1 - Introduction
- 2 - Manipulating data with NDArray
- 3 - Linear Algebra
- 4 - Probability and Statistics
- 5 - Automatic differentiation via `autograd`
- 1 - Linear Regression (from scratch)
- 2 - Linear Regression (with `gluon`)
- 3 - Multiclass Logistic Regression (from scratch)
- 4 - Multiclass Logistic Regression (with `gluon`)
- 5 - Overfitting and regularization (from scratch)
- Roadmap L1 and L2 Regularization (in `gluon`)
- 1 - Multilayer Perceptrons (from scratch!)
- 2 - Multilayer Perceptrons (with `gluon`!)
- Roadmap Dropout Regularization (from scratch)
- Roadmap Dropout Regularization (with `gluon`)
- A look under the hood of `mxnet.gluon`
- Writing custom layers with `gluon.Block`
- Serialization: Loading and saving models and parameters
- Advanced Data IO
- 1 - Convolutional Neural Network (from scratch!)
- 2 - Convolutional Neural Network (with `gluon`!)
- 3 - Introduction to Deep CNNs (AlexNet)
- Roadmap Very deep networks and repeating blocks (VGG network)
- Roadmap Batch Normalization (from scratch)
- Roadmap Batch Normalization (with `gluon`)
- 1 - Simple RNNs (from scratch)
- 2 - LSTM RNNs (from scratch)
- 3 - GRUs (from scratch)
- 4 - RNNs (with `gluon`)
- Roadmap Dropout for recurrent nets
- Roadmap Zoneout regularization
- Roadmap Network of networks (inception & co)
- Roadmap Residual networks
- Object detection
- Roadmap Fully-convolutional networks
- Roadmap Siamese (conjoined?) networks
- Roadmap Embeddings (pairwise and triplet losses)
- Roadmap Inceptionism / visualizing feature detectors
- Roadmap Style transfer
- Fine-tuning
- Roadmap Word embeddings (Word2Vec)
- Roadmap Sentence embeddings (SkipThought)
- Roadmap Sentiment analysis
- Roadmap Sequence-to-sequence learning (machine translation)
- Roadmap Sequence transduction with attention (machine translation)
- Roadmap Named entity recognition
- Roadmap Image captioning
- Tree-LSTM for semantic relatedness
- Roadmap Introduction to autoencoders
- Roadmap Convolutional autoencoders (introduce upconvolution)
- Roadmap Denoising autoencoders
- Roadmap Variational autoencoders
- Roadmap Clustering
- Roadmap Two Sample Tests
- Roadmap Finding adversarial examples
- Roadmap Adversarial training
- Roadmap Introduction to GANs
- Roadmap DCGAN
- Roadmap Wasserstein-GANs
- Roadmap Energy-based GANs
- Roadmap Conditional GANs
- Roadmap Image transduction GANs (Pix2Pix)
- Roadmap Learning from Synthetic and Unsupervised Images
- Roadmap Introduction to reinforcement learning
- Roadmap Deep contextual bandits
- Roadmap Deep Q-networks
- Roadmap Policy gradient
- Roadmap Actor-critic gradient
- Roadmap Dropout-based uncertainty estimation (BALD)
- Roadmap Weight uncertainty (Bayes-by-backprop)
- Roadmap Variational autoencoders
- Roadmap SGD
- Roadmap Momentum
- Roadmap AdaGrad
- Roadmap RMSProp
- Roadmap Adam
- Roadmap AdaDelta
- Roadmap SGLD / SGNHT
- Roadmap Distributed optimization (Asynchronous SGD, ...)
- Training with Multiple GPUs
- Fast & flexible: combining imperative & symbolic nets with HybridBlocks
- Roadmap Training with Multiple Machines
- Roadmap Combining imperative deep learning with symbolic graphs
- Custom Operators
- ...
- Roadmap Intro to automatic speech recognition
- Roadmap Connectionist temporal classification (CTC) for unaligned sequences
- Roadmap Combining static and sequential data
- Roadmap Latent factor models
- Roadmap Deep latent factor models
- Roadmap Bilinear models
- Roadmap Learning from implicit feedback
- Roadmap Forecasting
- Roadmap Modeling missing data
- Roadmap Combining static and sequential data
- Roadmap `gluon`
- Roadmap PyTorch to MXNet
- Roadmap Tensorflow to MXNet
- Roadmap Keras to MXNet
- Roadmap Math to MXNet
I've designed these tutorials so that you can traverse the curriculum in one of three ways.
- Anarchist - Choose whatever you want to read, whenever you want to read it.
- Imperialist - Proceed through all tutorials in order. This way you will see each model first from scratch, with all the code written by hand except for the basic linear algebra primitives and automatic differentiation.
- Capitalist - If you don't care how things work (or already know) and just want to see working code in `gluon`, you can skip (from scratch!) tutorials and go straight to the production-like code using the high-level `gluon` front end.
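To give a flavor of what "from scratch" means, here is a rough sketch of linear regression written by hand. It is not taken from the notebooks: it uses plain Python rather than MXNet so that it runs anywhere, the function names `make_data` and `train` are hypothetical, and the gradients are derived by hand where the tutorials would lean on automatic differentiation.

```python
import random


def make_data(n, true_w, true_b, rng):
    """Synthetic 1-D regression data: y = true_w * x + true_b + small noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_w * x + true_b + rng.gauss(0.0, 0.01) for x in xs]
    return xs, ys


def train(xs, ys, lr=0.1, epochs=200):
    """Fit y ~ w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: residuals of the current linear model.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Backward pass, derived by hand here (assumption: this stands in
        # for what an autograd system would compute automatically).
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Gradient descent update.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


rng = random.Random(0)  # fixed seed so runs are repeatable
xs, ys = make_data(200, true_w=2.0, true_b=-1.0, rng=rng)
w, b = train(xs, ys)
print("w = %.2f, b = %.2f" % (w, b))
```

The learned `w` and `b` should land close to the values used to generate the data. A high-level front end hides the forward pass, gradient computation, and update step behind a few library calls, which is the trade-off the three reading paths above are about.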
This evolving creature is a collaborative effort. So far, some amount of credit (and blame) can be shared by:
- Zachary C. Lipton (@zackchase)
- Mu Li (@mli)
- Alex Smola (@smolix)
- Eric Junyuan Xie (@piiswrong)
In creating these tutorials, I have drawn inspiration from some of the resources that helped me to learn machine learning and how to program with Theano and PyTorch:
- Soumith Chintala's Deep Learning with PyTorch: A 60 Minute Blitz
- Alec Radford's Bare-bones intro to Theano
- Video of Alec's intro to deep learning with Theano
- Chris Bishop's Pattern Recognition and Machine Learning
Already, in the short time this project has been off the ground, we've gotten some helpful PRs from the community with pedagogical suggestions, typo corrections, and other useful fixes. If you're inclined, please contribute!