
This repository contains the materials for the natural language processing courses in the DeepLearning.AI NLP series.


Course 1: Natural Language Processing with Classification and Vector Spaces

Topics: Use logistic regression, naïve Bayes, and word vectors to implement sentiment analysis, complete analogies, and translate words.

Week1: Logistic Regression

In this class, I learnt to extract features from text as numerical vectors, then build a binary classifier for tweets using logistic regression.
Topics: Sentiment analysis, Logistic regression, Data pre-processing, Calculating word frequencies, Feature extraction, Vocabulary creation, Supervised learning
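
As a rough illustration of the feature extraction and classification steps, the sketch below maps a tokenized tweet to a three-element vector (bias, summed positive counts, summed negative counts) and passes it through a sigmoid. The function names, the toy tweets, and the hand-picked weights `theta` are assumptions made up for this example; they are not taken from the course notebooks, and no training is performed.

```python
import math

def build_freqs(tweets, labels):
    """Count how often each word appears under each label (1 = positive, 0 = negative)."""
    freqs = {}
    for tokens, label in zip(tweets, labels):
        for word in tokens:
            freqs[(word, label)] = freqs.get((word, label), 0) + 1
    return freqs

def extract_features(tokens, freqs):
    """Return [bias, sum of positive counts, sum of negative counts] for one tweet."""
    unique = set(tokens)
    pos = sum(freqs.get((w, 1), 0) for w in unique)
    neg = sum(freqs.get((w, 0), 0) for w in unique)
    return [1.0, float(pos), float(neg)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(tokens, freqs, theta):
    """Probability that the tweet is positive, given weights theta."""
    x = extract_features(tokens, freqs)
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

# Toy data and weights (assumed for illustration only).
tweets = [["happy", "great", "day"], ["sad", "bad", "day"], ["great", "movie"]]
labels = [1, 0, 1]
freqs = build_freqs(tweets, labels)
theta = [0.0, 0.5, -0.5]  # hand-picked, not learned by gradient descent

features = extract_features(["great", "day"], freqs)
prob = predict(["great", "day"], freqs, theta)
```

In this toy setup a probability above 0.5 would be read as positive sentiment; in practice `theta` would be learned from labeled tweets.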



Week2: Naive Bayes

Learnt the theory behind Bayes' rule for conditional probabilities, then applied it to build a Naive Bayes tweet classifier.
Topics: Error analysis, Naive Bayes inference, Log likelihood, Laplacian smoothing, Conditional probabilities, Bayes' rule, Sentiment analysis
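
A minimal sketch of Naive Bayes inference with add-one (Laplacian) smoothing, assuming tweets are already tokenized. The function names and the toy data are hypothetical; the score is the log prior plus the sum of per-word log likelihood ratios, and a positive score suggests positive sentiment.

```python
import math
from collections import Counter

def train_naive_bayes(tweets, labels):
    """Return the log prior and per-word log likelihood ratios (add-one smoothing)."""
    pos_counts, neg_counts = Counter(), Counter()
    for tokens, label in zip(tweets, labels):
        (pos_counts if label == 1 else neg_counts).update(tokens)

    vocab = set(pos_counts) | set(neg_counts)
    n_pos, n_neg = sum(pos_counts.values()), sum(neg_counts.values())

    d_pos = sum(1 for y in labels if y == 1)
    d_neg = len(labels) - d_pos
    log_prior = math.log(d_pos / d_neg)

    log_likelihood = {}
    for w in vocab:
        p_w_pos = (pos_counts[w] + 1) / (n_pos + len(vocab))
        p_w_neg = (neg_counts[w] + 1) / (n_neg + len(vocab))
        log_likelihood[w] = math.log(p_w_pos / p_w_neg)
    return log_prior, log_likelihood

def score(tokens, log_prior, log_likelihood):
    """Words outside the vocabulary contribute nothing to the score."""
    return log_prior + sum(log_likelihood.get(w, 0.0) for w in tokens)

# Toy corpus (assumed for illustration only).
tweets = [["happy", "great"], ["sad", "bad"], ["great", "day"], ["bad", "day"]]
labels = [1, 0, 1, 0]
log_prior, log_likelihood = train_naive_bayes(tweets, labels)

positive_score = score(["great", "day"], log_prior, log_likelihood)
negative_score = score(["bad", "unseen"], log_prior, log_likelihood)
```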



Week3: Vector Space Models

Vector space models capture semantic meaning and relationships between words. I learnt how to create word vectors that capture dependencies between words, then visualize their relationships in two dimensions using PCA.
Topics: Covariance matrices, Dimensionality reduction, Principal component analysis, Cosine similarity, Euclidean distance, Co-occurrence matrices, Vector representations, Vector space models
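
To make the difference between the two similarity measures concrete, here is a small sketch of cosine similarity and Euclidean distance on made-up co-occurrence vectors. The vector values and names are assumptions for illustration. Cosine similarity depends only on direction, so two vectors that differ only in scale score as identical even though they are far apart in Euclidean terms.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical co-occurrence counts with two context words.
small = [10.0, 1.0]    # counts from a small corpus
large = [100.0, 10.0]  # same proportions, ten times more text
other = [5.0, 8.0]     # different proportions

cos_small_large = cosine_similarity(small, large)
cos_small_other = cosine_similarity(small, other)
dist_small_large = euclidean_distance(small, large)
dist_small_other = euclidean_distance(small, other)
```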


Week4: Machine Translation

Learnt to transform word vectors and assign them to subsets using locality sensitive hashing, in order to perform machine translation and document search.
Topics: Gradient descent, Approximate nearest neighbors, Locality sensitive hashing, Hash functions, Hash tables, K nearest neighbors, Document search, Machine translation, Frobenius norm
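
The bucketing idea behind locality sensitive hashing can be sketched as follows: each hyperplane contributes one bit (the sign of the dot product with its normal vector), and the bits are combined into a bucket id. The planes here are hand-picked so the result is reproducible; in practice they would typically be drawn at random. All names and vectors are assumptions for illustration.

```python
def hash_vector(v, planes):
    """Combine the side of each hyperplane into a single integer bucket id."""
    bucket = 0
    for i, plane in enumerate(planes):
        dot = sum(a * b for a, b in zip(v, plane))
        if dot >= 0:  # vector lies on the positive side of plane i
            bucket += 2 ** i
    return bucket

def build_hash_table(vectors, planes):
    """Group vector names by bucket id."""
    table = {}
    for name, v in vectors.items():
        table.setdefault(hash_vector(v, planes), []).append(name)
    return table

# Hand-picked plane normals and toy 2-D vectors (assumed for illustration).
planes = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
vectors = {"a": [2.0, 1.0], "b": [1.5, 0.5], "c": [-1.0, -2.0]}

table = build_hash_table(vectors, planes)
```

A nearest-neighbor search then only needs to compare a query against the vectors in its own bucket, which is what makes the search approximate but fast.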



Course 2: Natural Language Processing with Probabilistic Models

Topics: Word2vec, Parts-of-Speech tagging, N-gram language models, autocorrect

Week1: Autocorrect and minimum edit distance.

In this class, I learnt to apply different edit operations (delete, insert, switch, replace) to build a simple autocorrect model.
Topics: Autocorrect, minimum edit distance, edit operations.
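
A minimal dynamic-programming sketch of minimum edit distance. The default costs (insert 1, delete 1, replace 2) are an assumption chosen for this example and can be changed through the parameters; the switch operation is not modeled here.

```python
def min_edit_distance(source, target, ins_cost=1, del_cost=1, rep_cost=2):
    """Cheapest sequence of inserts, deletes, and replaces turning source into target."""
    m, n = len(source), len(target)
    # D[i][j] = cost of converting source[:i] into target[:j]
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + del_cost
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            replace = 0 if source[i - 1] == target[j - 1] else rep_cost
            D[i][j] = min(
                D[i - 1][j] + del_cost,     # delete from source
                D[i][j - 1] + ins_cost,     # insert into source
                D[i - 1][j - 1] + replace,  # replace, or keep if equal
            )
    return D[m][n]

distance_play_stay = min_edit_distance("play", "stay")
distance_eer_near = min_edit_distance("eer", "near")
distance_unit_cost = min_edit_distance("play", "stay", rep_cost=1)
```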



Week2: Part of speech tagging

In this class, the objective is to use Markov chains and hidden Markov models to create part-of-speech tags for a text corpus.
Topics: Markov chains, Hidden Markov models, Part-of-speech tagging, Viterbi algorithm, Transition probabilities, Emission probabilities.
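
The Viterbi algorithm can be sketched on a tiny hidden Markov model as below. The two tags, the probabilities, and the function signature are all made up for illustration, and the sketch assumes every transition and emission probability is nonzero (for example after smoothing) so that the logarithms are defined.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most likely tag sequence for words under a first-order HMM."""
    # best[i][t]: best log probability of any tag path ending in tag t at position i
    best = [{t: math.log(start_p[t]) + math.log(emit_p[t][words[0]]) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, prev_score = max(
                ((p, best[i - 1][p] + math.log(trans_p[p][t])) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = prev_score + math.log(emit_p[t][words[i]])
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy model (all numbers assumed for illustration).
tags = ["NOUN", "VERB"]
start_p = {"NOUN": 0.8, "VERB": 0.2}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.6, "VERB": 0.4}}
emit_p = {"NOUN": {"dogs": 0.7, "bark": 0.3}, "VERB": {"dogs": 0.1, "bark": 0.9}}

predicted_tags = viterbi(["dogs", "bark"], tags, start_p, trans_p, emit_p)
```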



Week3: Autocomplete and Language Models

Learnt how N-gram language models calculate sequence probabilities.
Topics: Language modeling, perplexity, K-smoothing, N-grams, Backoff, Tokenization.
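
A small sketch of a bigram language model with K-smoothing and perplexity, assuming pre-tokenized sentences. The function names, the `<s>` and `</s>` boundary tokens, and the toy corpus are assumptions for this example; backoff is not shown.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count context words and bigrams, padding each sentence with boundary tokens."""
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])             # every word that precedes another
        bigrams.update(zip(padded, padded[1:]))
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab

def bigram_prob(prev, word, contexts, bigrams, vocab, k=1.0):
    """P(word | prev) with K-smoothing."""
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * len(vocab))

def perplexity(tokens, contexts, bigrams, vocab, k=1.0):
    """Inverse probability of the sentence, normalized by the number of predictions."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = sum(
        math.log(bigram_prob(p, w, contexts, bigrams, vocab, k))
        for p, w in zip(padded, padded[1:])
    )
    return math.exp(-log_prob / (len(padded) - 1))

# Toy corpus (assumed for illustration only).
sentences = [["i", "like", "cats"], ["i", "like", "dogs"]]
contexts, bigrams, vocab = train_bigram(sentences)

p_like_given_i = bigram_prob("i", "like", contexts, bigrams, vocab)
seen_ppl = perplexity(["i", "like", "cats"], contexts, bigrams, vocab)
scrambled_ppl = perplexity(["cats", "like", "i"], contexts, bigrams, vocab)
```

A lower perplexity means the model finds the sentence less surprising, so the sentence in natural word order should score lower than its scrambled version.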



Week4: Word embeddings with neural networks

Learnt how to use word embeddings to carry the semantic meaning of words and build a continuous bag-of-words model.
Topics: Gradient descent, one-hot vectors, neural networks, word embeddings, continuous bag-of-words model, text pre-processing, tokenization.
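
The input side of a continuous bag-of-words model can be sketched as follows: for each center word, the one-hot vectors of the surrounding context words are averaged to form the network input, and the center word is the prediction target. The function names, window size, and sample sentence are assumptions for illustration; the neural network itself is not shown.

```python
def one_hot(word, word_to_index):
    """Vector of zeros with a single 1.0 at the word's index."""
    vec = [0.0] * len(word_to_index)
    vec[word_to_index[word]] = 1.0
    return vec

def cbow_examples(tokens, half_window, word_to_index):
    """Yield (averaged context vector, center word) training pairs."""
    for i in range(half_window, len(tokens) - half_window):
        context = tokens[i - half_window:i] + tokens[i + 1:i + half_window + 1]
        vectors = [one_hot(w, word_to_index) for w in context]
        average = [sum(column) / len(vectors) for column in zip(*vectors)]
        yield average, tokens[i]

# Sample sentence (assumed for illustration only).
tokens = "i am happy because i am learning".split()
word_to_index = {w: i for i, w in enumerate(sorted(set(tokens)))}

examples = list(cbow_examples(tokens, 2, word_to_index))
first_input, first_target = examples[0]
```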



Course 3: Natural Language Processing with Sequence Models


Week1: Neural Networks for Sentiment Analysis.

Learnt about neural networks for deep learning, then built a tweet classifier that uses a deep neural network to place tweets into positive or negative sentiment categories.
Topics: Feature extraction, Supervised machine learning, Text preprocessing, ReLU, Neural networks.



Week2: Recurrent Neural Networks for Language Modeling.

Learnt about the limitations of traditional language models and saw how RNNs and GRUs use sequential data for text prediction.
Topics: N-grams, Gated recurrent units, Recurrent neural networks.
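
To show how a gated recurrent unit carries information across time steps, here is a scalar sketch of a single GRU step with a hidden size of one. The parameter names and values are hypothetical, and the update convention used here (new state = (1 - update) * old state + update * candidate) is one common formulation; some references swap the roles of the two terms.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, p):
    """One GRU time step for scalar input x and scalar hidden state h_prev."""
    update = sigmoid(p["w_u"] * x + p["u_u"] * h_prev + p["b_u"])  # how much to overwrite
    reset = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])   # how much past to use
    candidate = math.tanh(p["w_h"] * x + p["u_h"] * (reset * h_prev) + p["b_h"])
    return (1.0 - update) * h_prev + update * candidate

# Hypothetical parameters: all zeros, then a strongly negative update bias.
zeros = {name: 0.0 for name in
         ["w_u", "u_u", "b_u", "w_r", "u_r", "b_r", "w_h", "u_h", "b_h"]}
keep_memory = dict(zeros, b_u=-20.0)

h_half = gru_step(1.0, 0.8, zeros)        # update gate = 0.5, candidate = 0
h_kept = gru_step(1.0, 0.8, keep_memory)  # update gate near 0, state is preserved
```

When the update gate stays close to zero, the previous hidden state passes through almost unchanged, which is how the unit can retain information over many steps.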



Week3: LSTMs and Named Entity Recognition.

Learnt how long short-term memory units (LSTMs) mitigate the vanishing gradient problem, and how named entity recognition systems quickly extract important information from text.
Topics: Vanishing gradients, Named entity recognition, LSTMs, Feature extraction, Part-of-speech tagging, Data generators.

