Deep NLP Course

Deep NLP Course at ABBYY

Deep learning for NLP crash course at ABBYY.

Suggested textbook: Neural Network Methods in Natural Language Processing by Yoav Goldberg

Materials

Week 1: Introduction

Sentiment analysis on the IMDB movie review dataset: a short overview of classical machine learning for NLP plus an indecently brief intro to Keras.

Week 2: Word Embeddings: Part 1

Meet the word embeddings: an unsupervised method that captures some fun relationships between words.
Phrase similarity with a word embedding model, plus word-based machine translation without parallel data (using MUSE word embeddings).
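
As a rough illustration of phrase similarity with word embeddings, here is a minimal sketch that averages word vectors and compares phrases by cosine similarity. The averaging strategy and the three-dimensional `toy_embeddings` are assumptions made up for this example; they are not taken from the course notebooks.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def phrase_vector(phrase, embeddings):
    """Average the vectors of the known words in a phrase."""
    vectors = [embeddings[w] for w in phrase.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError("no known words in phrase")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Hypothetical embeddings, invented for illustration only.
toy_embeddings = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "movie": [0.1, 0.9, 0.2],
    "film": [0.2, 0.8, 0.3],
    "terrible": [-0.8, 0.1, 0.1],
}

sim_close = cosine_similarity(
    phrase_vector("good movie", toy_embeddings),
    phrase_vector("great film", toy_embeddings),
)
sim_far = cosine_similarity(
    phrase_vector("good movie", toy_embeddings),
    phrase_vector("terrible film", toy_embeddings),
)
print(sim_close > sim_far)
```

With real pretrained vectors the same two functions apply unchanged; only the `embeddings` dictionary differs.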

Week 3: Word Embeddings: Part 2

Introduction to PyTorch. Implementation of a toy linear regression in pure NumPy and in PyTorch. Implementations of CBoW, skip-gram, negative sampling and structured Word2vec models.
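
To make the difference between skip-gram and CBoW concrete, the sketch below builds the training examples each model would see from a tokenized sentence. The function names and the `window` parameter are hypothetical, chosen for this illustration; the actual models in the notebook are trained on top of examples like these.

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) pairs: skip-gram predicts each neighbour from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """(context_words, center) examples: CBoW predicts the center word from its neighbours."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, center))
    return examples


sentence = "the cat sat on the mat".split()
print(skipgram_pairs(sentence, window=1)[:4])
print(cbow_examples(sentence, window=1)[:2])
```

Negative sampling keeps the same positive pairs and adds a few randomly drawn words per pair as negatives, so the model avoids a softmax over the full vocabulary.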

Week 4: Convolutional Neural Networks

Introduction to convolutional networks. Relations between convolutions and n-grams. A simple surname detector built on character-level convolutions, plus fun visualizations.
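
One way to see the relation between convolutions and n-grams: a filter of width n scores every character n-gram of a word, and max-over-time pooling keeps the best-matching one. The sketch below assumes one-hot character vectors and a hand-written filter; both are illustrative choices, not the learned filters from the notebook.

```python
def char_ngrams(word, n):
    """All character n-grams of a word, i.e. the windows a width-n filter slides over."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def conv1d_max(word, char_vectors, kernel):
    """Apply one filter to every window, then max-pool over positions.

    Returns the best score and the n-gram that produced it.
    """
    width = len(kernel)
    if len(word) < width:
        raise ValueError("word is shorter than the filter width")
    scores = []
    for start in range(len(word) - width + 1):
        window = [char_vectors[ch] for ch in word[start:start + width]]
        score = sum(
            weight * value
            for filter_row, char_vec in zip(kernel, window)
            for weight, value in zip(filter_row, char_vec)
        )
        scores.append(score)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return scores[best], word[best:best + width]


# Hypothetical setup: one-hot characters and a filter that responds to the bigram "ov".
alphabet = sorted(set("ivanov"))
one_hot = {ch: [1 if ch == other else 0 for other in alphabet] for ch in alphabet}
ov_filter = [one_hot["o"], one_hot["v"]]

best_score, best_ngram = conv1d_max("ivanov", one_hot, ov_filter)
print(char_ngrams("ivanov", 2), best_score, best_ngram)
```

A trained network learns many such filters at once, each acting as a soft detector for a family of n-grams.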

Week 5: RNNs: Part 1

RNNs for text classification. Simple RNN implementation plus a memorization test. Surname detector in a multilingual setup: a character-level LSTM classifier.
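
For orientation, a simple (Elman) RNN updates its hidden state as `h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)`. The pure-Python sketch below spells that out; the names `w_xh`, `w_hh` and `b` and the toy weights are assumptions for this example, and the notebook itself works with PyTorch tensors.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One step of a simple RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    hidden = []
    for j in range(len(h_prev)):
        total = b[j]
        total += sum(w_xh[j][i] * x[i] for i in range(len(x)))
        total += sum(w_hh[j][k] * h_prev[k] for k in range(len(h_prev)))
        hidden.append(math.tanh(total))
    return hidden


def rnn_forward(inputs, w_xh, w_hh, b):
    """Run the RNN over a sequence, starting from a zero state; return the last hidden state."""
    h = [0.0] * len(b)
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b)
    return h


# Toy weights: one input feature, one hidden unit.
last_hidden = rnn_forward([[1.0], [0.0]], w_xh=[[1.0]], w_hh=[[1.0]], b=[0.0])
print(last_hidden)
```

The second input is zero, so whatever remains in `last_hidden` was carried over from the first step through `w_hh`, which is the kind of behaviour a memorization test probes.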

Week 6: RNNs: Part 2

RNNs for sequence labelling. Part-of-speech tagger implementations based on word embeddings and character-level word embeddings.

Week 7: Language Models: Part 1

Character-level language model for generating Russian troll tweets: a fixed-window model via convolutions and an RNN model.
Simple conditional language model: surname generation given the source language.
Plus the Toxic Comment Classification Challenge, to apply your skills to a real-world problem.

Week 8: Language Models: Part 2

Word-level language model for poetry generation. Toy examples of transfer learning and multi-task learning applied to language models.

Week 9: Seq2seq

Seq2seq for machine translation and image captioning. Byte-pair encoding, beam search and other useful stuff for machine translation.
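
As a reminder of how byte-pair encoding learns its vocabulary, here is a minimal sketch: start from characters, then repeatedly merge the most frequent adjacent symbol pair. The function names, the `</w>` end-of-word marker and the tie-breaking rule are choices made for this example, and the word counts are invented; a production tokenizer would differ in details.

```python
from collections import Counter


def count_pairs(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merges from a {word: count} dictionary."""
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(vocab)
        if not pairs:
            break
        # Ties are broken by the pair itself so that the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


# Invented word counts, for illustration only.
merges, vocab = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=3)
print(merges)
print(vocab)
```

Applying the learned merges in order to a new word splits it into known subword units, which keeps the translation vocabulary small while still covering rare words.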

Week 10: Seq2seq with Attention

Seq2seq with attention for machine translation and image captioning.
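
The core of attention fits in a few lines: score each encoder state against the decoder query, turn the scores into weights with a softmax, and take the weighted sum of the values. The sketch below uses plain dot-product scoring, which is an assumption for this example; the notebook may use a different scoring function (for instance additive attention).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Return (context, weights) for one query over a sequence of key/value vectors."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


# Toy encoder states: the query lines up best with the second key.
keys = [[1.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
context, weights = dot_product_attention([0.0, 2.0], keys, values)
print(weights, context)
```

In a seq2seq decoder the `context` vector is recomputed at every output step and fed in alongside the decoder state.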

Week 11: Transformers & Text Summarization

Implementation of the Transformer model for text summarization. Discussion of Pointer-Generator Networks for text summarization.

Week 12: Dialogue Systems: Part 1

Goal-oriented dialogue systems. Implementation of a multi-task model: an intent classifier and a token tagger for the dialogue manager.

Week 13: Dialogue Systems: Part 2

General conversation dialogue systems and DSSMs. Implementation of a question answering model on the SQuAD dataset and a chit-chat model on the OpenSubtitles dataset.

Week 14: Pretrained Models

Pretrained models for various tasks: Universal Sentence Encoder for sentence similarity, ELMo for sequence tagging (with a bit of CRF), and BERT for SWAG (reasoning about possible continuations).

Final Presentation

NLP Summary: a summary of the cool stuff that did and didn't appear in the course.