/nlp_fundamentals

Primary language: Jupyter Notebook

Resources:
- https://aiplus.odsc.com/courses/nlp-fundamentals
- http://nlp.seas.harvard.edu/

Lesson 1: Text Representation (60m)

Theory: Familiarize yourself with NLP fundamentals and text preprocessing to prepare the data for our models. We will go through the main steps, such as removing stopwords, stemming, one-hot encoding, and more.

Exercise: Apply text preprocessing methods on a simple dataset.

Outcome: You will be able to apply the appropriate methods to preprocess text.
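The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the lesson's notebook code: the stopword list is a tiny hand-rolled stand-in for NLTK's, and `simple_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import re

# Tiny illustrative stopword list (a real pipeline would use NLTK's full list).
STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "with"}

def simple_stem(word):
    # Crude suffix stripping, standing in for a proper stemmer like Porter's.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    # Lowercase, tokenize, drop stopwords, then stem.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [simple_stem(t) for t in tokens if t not in STOPWORDS]

def one_hot(tokens, vocab):
    # One-hot encode each token against a fixed vocabulary.
    index = {w: i for i, w in enumerate(vocab)}
    return [[1 if i == index[t] else 0 for i in range(len(vocab))]
            for t in tokens]

docs = ["The cats are playing in the garden", "A dog played with the cats"]
processed = [preprocess(d) for d in docs]
vocab = sorted({t for doc in processed for t in doc})
print(processed)            # e.g. [['cat', 'play', 'garden'], ['dog', 'play', 'cat']]
print(one_hot(processed[0], vocab))
```

Note how stemming maps "playing" and "played" to the same token, which shrinks the vocabulary before encoding.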

Lesson 2: Topic Modeling (45m)

Theory: We will see what LDA is and how it can help extract information from documents. We will also try different clustering techniques and implement Non-negative Matrix Factorization (NMF).

Exercise: Apply topic modeling techniques on a simple text.

Outcome: You will be able to extract the main information from documents using topic modeling techniques.

Lesson 3: Text Classification (30m)

Theory: We will learn how it’s possible to represent text and how a classifier can use this representation. We will use TF-IDF and experiment with a couple of supervised learning models.

Exercise: Build an NLP pipeline to perform classification.

Outcome: You will be able to solve a text classification problem end to end.
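An end-to-end pipeline of this kind can be sketched in a few lines with scikit-learn. The tiny labelled dataset and the choice of logistic regression are illustrative assumptions; the lesson may use different data and models.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny labelled dataset, made up for illustration.
texts = [
    "great movie loved every minute",
    "fantastic acting and a clever plot",
    "terrible film a waste of time",
    "boring and badly acted",
]
labels = ["pos", "pos", "neg", "neg"]

# TF-IDF turns raw text into weighted term vectors; the classifier learns on those.
clf_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression()),
])
clf_pipeline.fit(texts, labels)

print(clf_pipeline.predict(["loved the clever plot", "what a waste"]))
```

Bundling vectorizer and classifier in a `Pipeline` guarantees that the exact same TF-IDF vocabulary fitted on the training texts is reused at prediction time.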

Lesson 4: Introduction to Deep Learning in NLP (45m)

Theory: Understand word embeddings, how they work, and how to use them. We will go through the main concepts behind word embeddings and see some practical examples using the Gensim library.

Exercise: Leverage Python deep learning libraries to create an NLP pipeline for sentiment analysis.

Outcome: You will be able to use word embeddings to tackle text classification tasks.

Lesson 5: Overview of Advanced Deep NLP (15m)

Theory: We will introduce the most recent developments of deep learning in NLP; in particular, we will see how to leverage BERT and ELMo and their pre-trained models to solve NLP problems.
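To give a flavor of what pre-trained models make possible, here is a minimal sketch using the Hugging Face `transformers` library (an assumption; the course material may use a different toolkit). The first call downloads the library's default sentiment checkpoint, a BERT-family model fine-tuned on sentiment data.

```python
from transformers import pipeline

# Loads a pre-trained transformer fine-tuned for sentiment analysis;
# weights are downloaded on first use.
classifier = pipeline("sentiment-analysis")

result = classifier("Transfer learning makes NLP much easier.")[0]
print(result["label"], round(result["score"], 3))
```

The point of the lesson: instead of training from scratch as in earlier lessons, you start from a model that has already learned language structure from massive corpora and only adapt it to your task.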