This course was created by Prof. Mohammad Ghassemi in Fall 2020 as part of the CSE 842 class at Michigan State University. It provides a step-by-step guide to NLP and assumes no prior background in NLP or machine learning. The content in this repository will teach you:
- How to collect and process text data.
- How to generate text using language models.
- How to classify text using machine learning.
- How to use and tune state-of-the-art sequence-to-sequence models, including transformers.
- How to process speech signals.

All lectures are hosted on YouTube and can be watched at your own pace (see links below). Most lectures end with a tutorial and homework assignment that demonstrate how to perform NLP tasks in Python. The Python notebooks are available through the links below and in the Homework folder.
- Lectures:
- HW0: Setting up your notebook and GitLab repo
- Project: Guidelines
- Optional Readings:
- Lecture:
- HW1 and Code Tutorial: Basic data manipulations, representations and statistics
- Optional Readings:
- Lecture:
- HW2 and Code Tutorial: Supervised language classification models and their assessment
- Optional Readings:
- Lecture:
- HW3 and Code Tutorial: Embeddings and Neural Networks
- Optional Readings:
- Lecture:
- HW4 and Code Tutorial: Sequence Models
- Optional Readings:
- Lecture:
- HW5 and Code Tutorial: Transformers
- Optional Readings:
- Lecture:
- HW6 and Code Tutorial: Context-free grammars
- Optional Readings:
- Lecture:
- HW7 and Code Tutorial: Speech Analysis