Deep NLP Reading List

This serves as my own detailed roadmap and reading list/notes for studying Deep Learning and/with NLP. Each section will refer to useful materials that can help, including MOOCs, blog posts, books, lecture notes, papers, and other awesome paper lists and roadmaps.

Mathematical Foundations
- Basics
- Advanced
Machine Learning
Deep Learning
Statistical NLP
Deep Learning for NLP

Mathematical Foundations

Basics

[Back To TOC]

If you are confident in these math subjects, you can just skip this part or simply take a look at some refreshers.

Linear Algebra
- Refreshers
  - Youtube Playlist: Essence of Linear Algebra
    - Approxmiately 2 hour long videos with Very Good Visualizations and clear explanations
  - Khan Academy Linear Algebra
- MIT Linear Algebra
Multivariable Calculus
- Refreshers
  - Youtube Playlist: Essence of Calculus
  - Highlights of Calculus: Video lectures by Prof. Gilbert Strang, MIT
  - Khan Academy Multivariable Calculus
- MIT Multivariable Calculus
Probability and Statistics
- Refreshers
  - Deep Learning Book Chapter 3 - Probability and Information Theory
  - Chapters 1, 2, and 11 of 'Pattern Recognition and Machine Learning' by Bishop (2006)
  - Khan Academy Probability and Statistics
- Harvard STAT110
- Readings
  - Chapters 2~6 of 'Machine Learning A Probabilistic Perspective' by Murphy (2012)
  - Lecture Notes: 'Probability and Statistics for Data Science'

Advanced

[Back To TOC]

The following subjects are some advanced materials that could be useful in understanding many Deep Learning theories and NLP. Particularly relevant ones are bolded.

Information Theory
- Refreshers
  - Deep Learning Book Chapter 3 - Probability and Information Theory
  - Khan Academy Information Theory
Statistical Inference
- CMU Intermediate Statistics
Advanced Probability
Random Matrix Theory
- MIT Eigenvalues of Random Matrices
Stochastic Processes
- Coursera Stochastic Processes
- MIT Discrete Stocahstic Processes
- UIUC Notes on Random Processes
Opimization Theory
Convex Optimization
Vector Calculus
Numerical Linear Algebra
- fast.ai Computational Linear Algebra: Focuses on applying what we learned from Linear Algebra to practical Data Science tasks.
Abstract Algebra
Real and Complex Analysis
Theories of Deep Learning
- Stanford STATS385

Machine Learning

[Back To TOC]

Machine Learning without Deep Learning.

Refreshers
- Kyunghyun Cho's ML w/o DL Lecture Notes
Introductory
- Andrew Ng's Machine Learning on Coursera
- Yaser Abu-Mostafa's Learning From Data
Advanced
- Tom Mitchell's Machine Learning
- CMU Intro to ML 10-701
- CMU Advanced Intro to ML
Readings
- 'Pattern Recognition and Machine Learning' by Bishop (2006)
- 'Machine Learning A Probabilistic Perspective' by Murphy (2012)

Deep Learning

[Back To TOC]

Refreshers
- Chapters 1, 2, 3, and 4 of Kyunghyun Cho's Natural Language Understanding with Distributed Representation Lecture Notes
- Andrew Ng's Deep Learning courses deeplearning.ai
CMU Introduction to Deep Learning
Stanford CS231n Convolutional Neural Networks for Computer Vision
- Youtube Playlist
Books
- Deep Learning Book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Neural Networks and Deep Learning by Michael Neilson
Papers: Full list organized by topics and models can be found in Deep-Learning-Papers-Reading-Roadmap, or Columbia's seminar course Advanced Topics in Deep Learning - Reading List
- Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444 [pdf]: A high-level survey paper by the three giants
- Y. LeCun, L. Bottou, Y. Bengio and P. Haffner. "Gradient-Based Learning Applied to Document Recognition." Proceedings of the IEEE, 86(11):2278-2324. 1998 (Seminal Paper: LeNet) [pdf]: LeNet: Image Classification on Handwritten Digits
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. [pdf]: Big hit of Deep Learning, AlexNet
Blog Posts
- The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy
- WildML Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs

Statistical NLP

[Back To TOC]

Refreshers
- Chapters 6~9.3 of Goldberg book
Columbia Michael Collins' COMS W4705: Natural Language Processing: This course covers a lot of traditional techniques often used in NLP.
- Notes on Statistical NLP
- Video Lectures on Coursera: Can't find the course anymore, but there are Youtube videos
Chris Manning's CS 224N/Ling 284 — Natural Language Processing before merging with Richard Socher's CS224D, covers some missing pieces of Michael Collins' class, along with more real life applications such as Machine Translation.
- Video Lectures on Youtube
Readings
- 'Foundations of Statistical Natural Language Processing' by Manning and Schütze (1999)
- Speech and Language Processing drafts by Dan Jurafsky and James Martin