NLP course, MIPT
Anton Emelianov (login-const@mail.ru, @king_menin), Albina Akhmetgareeva (albina.akhmetgareeva@gmail.com)
Videos here
Exam questions here
- Lecture: Intro to NLP
- Practical: Text preprocessing
- Video
- Lecture: Word embeddings
Distributional semantics. Count-based (pre-neural) methods. Word2Vec: learn vectors. GloVe: count, then learn. N-grams (collocations). RusVectores. t-SNE.
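As a rough illustration of the count-based methods listed above, the sketch below counts word co-occurrences in a fixed window and reweights them with positive PMI. The toy corpus, the window size, and the function names `cooccurrence_counts` and `ppmi` are assumptions made for this example, not material from the lecture.

```python
import math
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count symmetric (word, context) pairs within a fixed window."""
    pair_counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(word, tokens[j])] += 1
    return pair_counts

def ppmi(pair_counts):
    """Positive PMI: max(0, log2(P(w, c) / (P(w) * P(c))))."""
    total = sum(pair_counts.values())
    word_totals, context_totals = Counter(), Counter()
    for (w, c), n in pair_counts.items():
        word_totals[w] += n
        context_totals[c] += n
    scores = {}
    for (w, c), n in pair_counts.items():
        pmi = math.log2(n * total / (word_totals[w] * context_totals[c]))
        scores[(w, c)] = max(0.0, pmi)
    return scores

# Toy corpus, assumed purely for illustration.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
counts = cooccurrence_counts(corpus, window=2)
scores = ppmi(counts)
```

Each row of the resulting sparse PPMI matrix can serve as a word vector. Prediction-based models such as Word2Vec learn dense vectors instead of counting.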
- Lecture: RNN + CNN, Text classification
Neural Language Models: Recurrent Models, Convolutional Models. Text classification (architectures)
- Practical: Classification with LSTM, CNN
- Video
- Lecture: Language modelling and NER
Task description, methods (Markov models, RNNs), evaluation (perplexity). Sequence labelling (NER, POS tagging, chunking, etc.). N-gram language models, HMM, MEMM, CRF.
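To make the n-gram and perplexity items above concrete, here is a minimal sketch of a bigram language model with add-one smoothing and its perplexity. The toy sentences, the choice of smoothing, and all function names are assumptions for illustration only.

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Collect history and bigram counts; the vocab holds every predictable token."""
    histories, bigrams, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(padded[1:])  # "<s>" is never predicted
        for prev, cur in zip(padded, padded[1:]):
            histories[prev] += 1
            bigrams[(prev, cur)] += 1
    return histories, bigrams, vocab

def bigram_prob(prev, cur, histories, bigrams, vocab_size):
    """Add-one (Laplace) smoothed estimate of P(cur | prev)."""
    return (bigrams[(prev, cur)] + 1) / (histories[prev] + vocab_size)

def perplexity(sentences, histories, bigrams, vocab):
    """exp of the average negative log-probability per predicted token."""
    log_prob, n_tokens = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            log_prob += math.log(bigram_prob(prev, cur, histories, bigrams, len(vocab)))
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)

# Toy data, assumed purely for illustration.
train = ["the cat sat".split(), "the dog sat".split()]
histories, bigrams, vocab = train_bigram_lm(train)
ppl_seen = perplexity(["the cat sat".split()], histories, bigrams, vocab)
ppl_unseen = perplexity(["sat cat the dog".split()], histories, bigrams, vocab)
```

A word order the model has seen gets a lower perplexity than an unseen one built from the same words. With no counts at all, the smoothed model is uniform and its perplexity equals the vocabulary size.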
Basics: Encoder-Decoder framework, inference (e.g., beam search), evaluation (BLEU). Attention: general, score functions, models. Bahdanau and Luong models. Transformer: self-attention, masked self-attention, multi-head attention.
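The self-attention and masked self-attention items above can be sketched in plain Python as single-head scaled dot-product attention. This is a simplified illustration: learned projection matrices and multiple heads are omitted, and the function names and toy vectors are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values, causal=False):
    """Scaled dot-product attention; with causal=True, position i ignores j > i."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: future position
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights = softmax(scores)
        output = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy token vectors used as queries, keys, and values at once.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x, causal=True)
```

With the causal mask, the first position can attend only to itself, which is what lets a Transformer decoder be trained on all positions in parallel without seeing future tokens.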
- Lecture: Transfer learning in NLP
BERTology (BERT, GPT models, T5, etc.), subword segmentation (BPE), evaluation of big LMs.
- Practical: Transformer models for a classification task
- Video
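For the subword segmentation (BPE) topic in the transfer learning lecture, a minimal sketch of learning BPE merges could look like the following. The word frequencies are a toy example, the `</w>` end-of-word marker is one common convention, and the function names are assumptions. Production tokenizers add further details such as byte-level handling.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)

def learn_bpe(word_freqs, num_merges):
    """Greedily merge the most frequent adjacent symbol pair, num_merges times."""
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)  # ties: first pair seen wins
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges, vocab

# Toy word frequencies, assumed purely for illustration.
word_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, vocab = learn_bpe(word_freqs, num_merges=3)
```

Frequent character sequences such as the suffix "est" end up as single symbols, while rare words stay split into smaller pieces, which keeps the vocabulary bounded without an out-of-vocabulary token.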
- Lecture & Practical: How to train big models? Distributed training
Training Multi-Billion Parameter Language Models. Model Parallelism. Data Parallelism.
- Practical: DDP example
- Video
- Lecture: Question answering
- Practical: QA seminar, chatbots seminar
- Video
SQuAD-style datasets (one-hop, multi-hop), architectures, retrieval and search, chatbots
- Lecture: Summarization, simplification, paraphrasing
- Practical: summarization seminar
- HW3, https://www.kaggle.com/c/mipt-nlp-hw3-2022
- Video
- Lecture: Multimodal NLP
- Video
- ruder.io
- Jurafsky & Martin
- Laura Kallmeyer's course on ML for NLP
- Nils Reimers' course on DL for NLP
- Oxford course on DL for NLP
- Stanford course on DL for NLP
- Reinforcement Learning for NLP
- Yandex NLP course
- Russian National Corpus (НКРЯ)
- OpenCorpora (Открытый корпус)
- Distributional semantic models for Russian
- Morphology
- Syntax
- Tomita Parser
- mathlingvo
- nlpub
- Text Visualisation browser
- Manning, Christopher D., and Hinrich Schütze. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press, 1999.
- Jurafsky, Daniel, and James H. Martin. Speech and Language Processing. 2000.
- Cohen, Shay. "Bayesian analysis in natural language processing." Synthesis Lectures on Human Language Technologies 9, no. 2 (2016): 1-274.
- Goldberg, Yoav. "Neural Network Methods for Natural Language Processing." Synthesis Lectures on Human Language Technologies 10, no. 1 (2017): 1-309.