YSDA Natural Language Processing course
- Lecture and seminar materials for each week are in ./week* folders
- YSDA homework deadlines are listed on the Anytask course page.
- For technical issues, bugs in course materials, or contribution ideas, add an issue.
- Installing libraries and troubleshooting: this thread.
Syllabus
- week01 Embeddings
  - Lecture: Word embeddings. Distributional semantics, LSA, Word2Vec, GloVe. Why and when we need them.
  - Seminar: Playing with word and sentence embeddings.
- week02 Text classification
  - Lecture: Text classification. Classical approaches for text representation: BOW, TF-IDF. Neural approaches: embeddings, convolutions, RNNs.
  - Seminar: Salary prediction with convolutional neural networks; explaining network predictions.
- week03 Language Models
  - Lecture: Language models: N-gram and neural approaches; visualizing trained models.
  - Seminar: Generating ArXiv papers with language models.
- week04 Seq2seq/Attention
  - Lecture: Seq2seq: encoder-decoder framework. Attention: Bahdanau model. Self-attention, Transformer. Pointer networks. Attention for analysis.
  - Seminar: Machine translation of hotel and hostel descriptions.
- week05 Structured Learning
  - Lecture: Structured Learning: structured perceptron, structured prediction, dynamic oracles, RL basics.
  - Seminar: POS tagging.
- week06 Expectation-Maximization
  - Lecture: Expectation-Maximization and Word Alignment Models.
  - Seminar: Implementing expectation maximization.
- week07 Machine translation
  - Lecture: Machine Translation: a review of the key ideas from PBMT, the application-specific ideas that have developed in NMT over the past 3 years, and some of the open problems in this area.
  - Seminar: Presentations by students.
- week08 Transfer learning and Multi-task learning
  - Lecture: What a network learns and why: "model" is never just "model"! Transfer learning in NLP. Multi-task learning in NLP. How to understand what kind of information the model representations contain.
  - Seminar: Improving named entity recognition by learning jointly with other tasks.
- week09 Domain Adaptation
  - Lecture: General theory. Instance weighting. Proxy-label methods. Feature matching methods. Distillation-like methods.
  - Seminar: Adapting a general machine translation model to a specific domain.
- week10 Dialogue Systems
  - Lecture: Task-oriented vs general conversation systems. Overview of a framework for task-oriented systems. General conversation: retrieval and generative approaches. Generative models for general conversation. Retrieval-based models for general conversation.
  - Seminar: Simple retrieval-based question answering.
- week11 Adversarial learning & Latent Variables for NLP
  - Lecture: Generative models recap, generative adversarial networks, variational autoencoders, and why you should care about them.
  - Seminar: Semi-supervised dictionary learning with adversarial networks.
- week12 Text Summarization
  - Lecture: Text summarization methods. Extractive vs abstractive. A piece of extractive text summarization. Abstractive text summarization.
Contributors & course staff
Course materials and teaching by
- Elena Voita - course admin, lectures, seminars, homeworks
- Boris Kovarsky - lectures, seminars, homeworks
- David Talbot - lectures, seminars, homeworks
- Sergey Gubanov - lectures, seminars, homeworks
- Just Heuristic - lectures, seminars, homeworks