Deep Learning - The Straight Dope

Abstract

This repo contains an incremental sequence of notebooks designed to teach deep learning, MXNet, and the gluon interface. Our goal is to leverage the strengths of Jupyter notebooks to present prose, graphics, equations, and code together in one place. If we're successful, the result will be a resource that could be simultaneously a book, course material, a prop for live tutorials, and a resource for plagiarising (with our blessing) useful code. To our knowledge there's no source out there that either (1) teaches the full breadth of concepts in modern deep learning or (2) interleaves an engaging textbook with runnable code. We'll find out by the end of this venture whether or not that void exists for a good reason.

Another unique aspect of this book is its authorship process. We are developing this resource fully in the public view and are making it available for free in its entirety. While the book has a few primary authors to set the tone and shape the content, we welcome contributions from the community and hope to coauthor chapters and entire sections with experts and community members. Already we've received contributions ranging from typo corrections to full working examples.

Implementation in MXNet

Throughout this book, we rely upon MXNet to teach core concepts, advanced topics, and a full complement of applications. MXNet is widely used in production environments owing to its strong reputation for speed. Now with gluon, MXNet's new imperative interface (alpha), doing research in MXNet is easy.

Dependencies

To run these notebooks, you'll want to build MXNet from source. Fortunately, this is easy (especially on Linux) if you follow these instructions. You'll also want to install Jupyter and use Python 3 (because it's 2017).
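Before opening the notebooks, it can help to confirm that the interpreter and packages are visible. The following is a minimal sketch using only the Python standard library; the helper name `check_environment` is hypothetical, and the import names `mxnet` and `jupyter` are assumptions about how those packages are exposed in your install.

```python
import sys
from importlib.util import find_spec


def check_environment(required=("mxnet", "jupyter")):
    """Report whether Python 3 and the named top-level packages are available.

    `required` holds import names; the defaults are assumptions, not
    something this README specifies.
    """
    report = {"python3": sys.version_info.major >= 3}
    for name in required:
        # find_spec locates a top-level package without importing it.
        report[name] = find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```

If an entry is reported as missing, revisit the build and install steps above before launching Jupyter.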

Part 1: Crashcourse

Part 2: Introduction to Supervised Learning

1 - Linear Regression (from scratch)
2 - Linear Regression (with gluon)
3 - Multiclass Logistic Regression (from scratch)
4 - Multiclass Logistic Regression (with gluon)
5 - Overfitting and regularization (from scratch)
Roadmap L1 and L2 Regularization (in gluon)

Part 3: Deep neural networks (DNNs)

1 - Multilayer Perceptrons (from scratch!)
2 - Multilayer Perceptrons (with gluon!)
Roadmap Dropout Regularization (from scratch)
Roadmap Dropout Regularization (with gluon)

Part 3.5: Plumbing

A look under the hood of mxnet.gluon
Writing custom layers with gluon.Block
Serialization: Loading and saving models and parameters
Advanced Data IO

Part 4: Convolutional neural networks (CNNs)

1 - Convolutional Neural Network (from scratch!)
2 - Convolutional Neural Network (with gluon!)
Roadmap Batch Normalization (from scratch)
Roadmap Batch Normalization (with gluon)

Part 5: Recurrent neural networks (RNNs)

1 - Simple RNNs (from scratch)
Roadmap Simple RNNs (with gluon)
3 - LSTM RNNs (from scratch)
Roadmap LSTMs (with gluon)
Roadmap GRUs (from scratch)
Roadmap GRUs (with gluon)
Roadmap Dropout for recurrent nets
Roadmap Zoneout regularization

Part 6: Computer vision (CV)

Roadmap Network of networks (inception & co)
Roadmap Residual networks
Object detection
Roadmap Fully-convolutional networks
Roadmap Siamese (conjoined?) networks
Roadmap Embeddings (pairwise and triplet losses)
Roadmap Inceptionism / visualizing feature detectors
Roadmap Style transfer

Part 7: Natural language processing (NLP)

Roadmap Word embeddings (Word2Vec)
Roadmap Sentence embeddings (SkipThought)
Roadmap Sentiment analysis
Roadmap Sequence-to-sequence learning (machine translation)
Roadmap Sequence transduction with attention (machine translation)
Roadmap Named entity recognition
Roadmap Image captioning

Part 8: Unsupervised Learning

Roadmap Introduction to autoencoders
Roadmap Convolutional autoencoders (introduce upconvolution)
Roadmap Denoising autoencoders
Roadmap Variational autoencoders
Roadmap Clustering

Part 9: Adversarial learning

Roadmap Two Sample Tests
Roadmap Finding adversarial examples
Roadmap Adversarial training

Part 10: Generative adversarial networks (GANs)

Roadmap Introduction to GANs
Roadmap DCGAN
Roadmap Wasserstein-GANs
Roadmap Energy-based GANs
Roadmap Conditional GANs
Roadmap Image transduction GANs (Pix2Pix)
Roadmap Learning from Synthetic and Unsupervised Images

Part 11: Deep reinforcement learning (DRL)

Roadmap Introduction to reinforcement learning
Roadmap Deep contextual bandits
Roadmap Deep Q-networks
Roadmap Policy gradient
Roadmap Actor-critic gradient

Part 12: Variational methods and uncertainty

Roadmap Dropout-based uncertainty estimation (BALD)
Roadmap Weight uncertainty (Bayes-by-backprop)
Roadmap Variational autoencoders

Part 13: Optimization

Roadmap SGD
Roadmap Momentum
Roadmap AdaGrad
Roadmap RMSProp
Roadmap Adam
Roadmap AdaDelta
Roadmap SGLD / SGNHT

Part 14: Optimization, Distributed and high-performance learning

Roadmap Distributed optimization (Asynchronous SGD, ...)
Training with Multiple GPUs
Fast & flexible: combining imperative & symbolic nets with HybridBlocks
Roadmap Training with Multiple Machines
Roadmap Combining imperative deep learning with symbolic graphs

Part 15: Hacking MXNet

Custom Operators
...

Part 16: Audio Processing

Roadmap Intro to automatic speech recognition
Roadmap Connectionist temporal classification (CTC) for unaligned sequences
Roadmap Combining static and sequential data

Part 17: Recommender systems

Roadmap Latent factor models
Roadmap Deep latent factor models
Roadmap Bilinear models
Roadmap Learning from implicit feedback

Part 18: Time series

Roadmap Forecasting
Roadmap Modeling missing data
Roadmap Combining static and sequential data

Appendix 1: Cheatsheets

Roadmap gluon
Roadmap PyTorch to MXNet
Roadmap Tensorflow to MXNet
Roadmap Keras to MXNet
Roadmap Math to MXNet

Choose your own adventure

I've designed these tutorials so that you can traverse the curriculum in one of three ways.

Anarchist - Choose whatever you want to read, whenever you want to read it.
Imperialist - Proceed through all tutorials in order. In this fashion you will be exposed to each model first from scratch, with all the code written by hand except for the basic linear algebra primitives and automatic differentiation.
Capitalist - If you don't care how things work (or already know) and just want to see working code in gluon, you can skip the (from scratch!) tutorials and go straight to the production-like code using the high-level gluon front end.

Authors

This evolving creature is a collaborative effort. So far, some amount of credit (and blame) can be shared by:

Zachary C. Lipton (@zackchase)
Mu Li (@mli)
Alex Smola (@smolix)
Eric Junyuan Xie (@piiswrong)

Inspiration

In creating these tutorials, I have drawn inspiration from some of the resources that helped me to learn machine learning and how to program with Theano and PyTorch:

Contribute

Already, in the short time this project has been off the ground, we've gotten some helpful PRs from the community with pedagogical suggestions, typo corrections, and other useful fixes. If you're inclined, please contribute!

robdefeo/mxnet-the-straight-dope