A curated list of awesome Deep Learning for Natural Language Processing resources

Awesome Deep Learning for Natural Language Processing (NLP)

Courses

  1. NLP with Deep Learning / CS224N from Stanford (Winter 2019)
  2. Neural Networks for NLP from Carnegie Mellon University
  3. Deep Learning for Natural Language Processing from University of Oxford and DeepMind

Books

  1. Deep Learning with Text: Natural Language Processing (Almost) from Scratch with Python and spaCy by Patrick Harrison and Matthew Honnibal
  2. Neural Network Methods in Natural Language Processing by Yoav Goldberg and Graeme Hirst
  3. Deep Learning in Natural Language Processing by Li Deng and Yang Liu
  4. Natural Language Processing in Action by Hobson Lane, Cole Howard, and Hannes Hapke
  5. Deep Learning: Natural Language Processing in Python by The LazyProgrammer (Kindle only)
    1. Word2Vec and Word Embeddings in Python and Theano
    2. From Word2Vec to GloVe in Python and Theano
    3. Recursive Neural Networks: Recursive Neural (Tensor) Networks in Theano
  6. Applied Natural Language Processing with Python by Taweh Beysolow II
  7. Deep Learning Cookbook by Douwe Osinga
  8. Deep Learning for Natural Language Processing: Creating Neural Networks with Python by Palash Goyal, Sumit Pandey, Karan Jain
  9. Machine Learning for Text by Charu C. Aggarwal
  10. Natural Language Processing with TensorFlow by Thushan Ganegedara
  11. fastText Quick Start Guide: Get started with Facebook's library for text representation and classification
  12. Hands-On Natural Language Processing with Python

Tutorials

  1. Text classification guide from Google
  2. Deep Learning for NLP with PyTorch

Talks

  1. Deep Learning for Natural Language Processing (without Magic)
  2. A Primer on Neural Network Models for Natural Language Processing
  3. Deep Learning for Natural Language Processing: Theory and Practice (Tutorial)
  4. TensorFlow Tutorials
  5. Practical Neural Networks for NLP from EMNLP 2016 using DyNet framework
  6. Recurrent Neural Networks with Word Embeddings
  7. LSTM Networks for Sentiment Analysis
  8. TensorFlow demo using the Large Movie Review Dataset
  9. LSTMVis: Visual Analysis for Recurrent Neural Networks
  10. Using deep learning in natural language processing by Rob Romijnders from PyData Amsterdam 2017
  11. Richard Socher's talk on sentiment analysis, question answering, and sentence-image embeddings
  12. Deep Learning, an interactive introduction for NLP-ers
  13. Deep Natural Language Understanding
  14. Deep Learning Summer School, Montreal 2016 - Includes state-of-the-art language modeling.
  15. Tackling the Limits of Deep Learning for NLP by Richard Socher

Frameworks

  1. Overview of DL frameworks for NLP

  2. General Frameworks

    1. Keras - The Python Deep Learning library. Emphasizes user friendliness, modularity, easy extensibility, and Pythonic design.
    2. TensorFlow - A cross-platform, general-purpose Machine Intelligence library with Python and C++ APIs.
    3. PyTorch - A deep learning framework that puts Python first. "Tensors and Dynamic neural networks in Python with strong GPU acceleration."
  3. Specific Frameworks

    1. spaCy - A Python package designed for speed and getting things done that interoperates with other Deep Learning frameworks.
    2. Gensim: Topic modeling for humans - A Python package that includes word2vec and doc2vec implementations.
    3. fastText - Facebook's library for fast text representation and classification.
    4. Built on TensorFlow
      1. SyntaxNet - A toolkit for natural language understanding (NLU).
      2. textsum - A Sequence-to-Sequence with Attention Model for Text Summarization.
      3. Skip-Thought Vectors implementation in TensorFlow.
      4. ActiveQA: Active Question Answering - Using reinforcement learning to train artificial agents for question answering
      5. BERT - Bidirectional Encoder Representations from Transformers for pre-trained models
    5. Built on PyTorch
      1. PyText - A deep-learning based NLP modeling framework by Facebook
      2. AllenNLP - An open-source NLP research library
      3. Flair - A very simple framework for state-of-the-art NLP
      4. fairseq - A Sequence-to-Sequence Toolkit
      5. fastai - Simplifies training fast and accurate neural nets using modern best practices
      6. Transformer model - Annotated notebook implementation
    6. Deeplearning4j’s NLP framework - Java implementation.
    7. DyNet - The Dynamic Neural Network Toolkit, designed to "work well with networks that have dynamic structures that change for every training instance".
    8. deepnl - A Python library for NLP based on Deep Learning neural network architecture.

Papers

  1. Deep or shallow, NLP is breaking out - General overview of how Deep Learning is impacting NLP.
  2. Natural Language Processing from Research at Google - Not all Deep Learning (but mostly).
  3. Context Dependent Recurrent Neural Network Language Model
  4. Translation Modeling with Bidirectional Recurrent Neural Networks
  5. Contextual LSTM (CLSTM) models for Large scale NLP tasks
  6. LSTM Neural Networks for Language Modeling
  7. Exploring the Limits of Language Modeling
  8. Conversational Contextual Cues - Models context and participants in conversations.
  9. Sequence to sequence learning with neural networks
  10. Efficient Estimation of Word Representations in Vector Space
  11. Learning Character-level Representations for Part-of-Speech Tagging
  12. Representation Learning for Text-level Discourse Parsing
  13. Fast and Robust Neural Network Joint Models for Statistical Machine Translation
  14. Parsing With Compositional Vector Grammars
  15. Smart Reply: Automated Response Suggestion for Email
  16. Neural Architectures for Named Entity Recognition - State-of-the-art performance in NER with bidirectional LSTM with a sequential conditional random layer and transition-based parsing with stack LSTMs.
  17. Grammar as a Foreign Language - State-of-the-art syntactic constituency parsing using a generic sequence-to-sequence approach.

Blog Posts

  1. Natural Language Processing (NLP) progress - Tracking the most common NLP tasks, including the datasets and the current state-of-the-art.
  2. A Review of the Recent History of Natural Language Processing
  3. Deep Learning, NLP, and Representations
  4. The Unreasonable Effectiveness of Recurrent Neural Networks
  5. Neural Language Modeling From Scratch
  6. Machine Learning for Emoji Trends
  7. Teaching Robots to Feel: Emoji & Deep Learning
  8. Computational Linguistics and Deep Learning - Opinion piece on how Deep Learning fits into the broader picture of text processing.
  9. Deep Learning NLP Best Practices
  10. 7 types of Artificial Neural Networks for Natural Language Processing
  11. How to solve 90% of NLP problems: a step-by-step guide

Datasets

  1. Dataset from "One Billion Word Language Modeling Benchmark" - Almost 1B words of already pre-processed text.
  2. Stanford Sentiment Treebank - Fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences.
  3. Chatbot data from Kaggle
  4. A list of text datasets that are free/public domain in alphabetical order
  5. Another list of text datasets that are free/public domain in reverse chronological order
  6. Question Answering datasets
    1. Quora's Question Pairs Dataset - Identify question pairs that have the same intent.
    2. CMU's Wikipedia Factoid Question Answers
    3. DeepMind's Algebra Question Answering
    4. DeepMind's Question Answering from CNN & DailyMail
    5. Microsoft's WikiQA Open Domain Question Answering
    6. Stanford Question Answering Dataset (SQuAD) - covering reading comprehension

Word Embeddings and friends

  1. The amazing power of word vectors from The Morning Paper blog
  2. Distributed Representations of Words and Phrases and their Compositionality - One of the original word2vec papers; introduces negative sampling.
  3. word2vec Parameter Learning Explained - An elucidating explanation of word2vec training.
  4. Word embeddings in 2017: Trends and future directions
  5. Learning Word Vectors for 157 Languages
  6. GloVe: Global Vectors for Word Representation - A "count-based"/co-occurrence model to learn word embeddings.
  7. Doc2Vec
  8. Dynamic word embeddings for evolving semantic discovery from The Morning Paper blog
  9. Ali Ghodsi's lecture on word2vec
  10. word2vec analogy demo
  11. TensorFlow Embedding Projector of word vectors
  12. Skip-Thought Vectors - "unsupervised learning of a generic, distributed sentence encoder"

Contributing

Have anything in mind that you think is awesome and would fit in this list? Feel free to send me a pull request!


License

CC0

To the extent possible under law, Dr. Brian J. Spiering has waived all copyright and related or neighboring rights to this work.