
My assignments for the Coursera 'Machine Learning Specialization' (University of Washington). In these assignments, I use numpy, pandas and scikit-learn rather than the graphlab and sframe packages provided by the instructors. There are slight differences in syntax between the two sets of Python packages.
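
As a rough illustration of the pandas and scikit-learn style used here, the sketch below filters a table and fits a one-feature linear regression. The data is synthetic and the column names (`sqft_living`, `price`) are assumptions chosen for the example, not the course's actual dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for a house-sales table (all values are made up).
rng = np.random.default_rng(0)
sales = pd.DataFrame({"sqft_living": rng.uniform(500, 4000, size=200)})
sales["price"] = (
    50_000 + 280 * sales["sqft_living"] + rng.normal(0, 10_000, size=200)
)

# pandas filters rows with a boolean mask on a column.
large = sales[sales["sqft_living"] > 2000]

# scikit-learn takes the feature matrix and the target as separate arguments.
model = LinearRegression()
model.fit(sales[["sqft_living"]], sales["price"])

slope = model.coef_[0]
intercept = model.intercept_
print(f"price ~ {intercept:.0f} + {slope:.1f} * sqft_living")
```

Note the double brackets in `sales[["sqft_living"]]`: scikit-learn expects a 2-D feature matrix, so the single column is selected as a DataFrame rather than a Series.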

Syllabus of Machine Learning Specialization

A Case Study Approach

  1. Introduction
  2. Regression - Predicting House Prices
  3. Classification - Sentiment Analysis
  4. Clustering and Similarity - Retrieving Documents
  5. Recommending Products (Excluded)
  6. Deep Learning - Searching for Images (Excluded)


Regression

Week1: Simple Regression - Linear Regression with One Input

  1. Introduction
  2. Predicting house prices (one feature)

Week2: Multiple Regression - Linear Regression with Multiple Features

  1. Exploring different multiple regression models for house price prediction (multiple variables)
  2. Implementing gradient descent for multiple regression

Week3: Assessing Performance

Exploring the bias-variance tradeoff

Week4: Ridge Regression - Regulating Overfitting When Using Many Features

  1. Observing effects of L2 penalty in polynomial regression
  2. Implementing ridge regression via gradient descent

Week5: Lasso Regression - Regularization for Feature Selection

  1. Using LASSO to select features
  2. Implementing LASSO using coordinate descent

Week6: Going Nonparametric - Nearest Neighbor and Kernel Regression

Predicting house prices using k-nearest neighbors regression


Classification

Week1: Linear Classifiers - Logistic Regression

  1. Introduction
  2. Predicting sentiment from product reviews

Week2: Linear Classifiers - Parameter learning, Overfitting & Regularization

  1. Implementing logistic regression from scratch
  2. Logistic regression with L2 regularization

Week3: Decision Trees

  1. Identifying safe loans with decision trees
  2. Implementing binary decision trees

Week4: Overfitting in Decision Trees, Handling Missing Data

  1. Decision trees in practice
  2. 3 Strategies for handling missing data

Week5: Boosting

  1. Exploring ensemble methods
  2. Boosting a decision stump

Week6: Evaluating Classifiers - Precision & Recall

Exploring precision and recall

Week7: Scaling to Huge Datasets & Online Learning

Training logistic regression via stochastic gradient ascent

Clustering & Retrieval

Week1: A machine learning perspective


Week2: Nearest Neighbor Search - Retrieving Documents

  1. Choosing features and metrics for nearest neighbor search
  2. Implementing Locality Sensitive Hashing from scratch

Week3: Clustering - Grouping Related Docs

  1. Clustering text data with k-means
  2. MapReduce for scaling k-means

Week4: Mixture Models: Model-based Clustering

  1. Implementing EM for Gaussian mixtures
  2. Clustering text data with Gaussian mixtures

Week5: Latent Dirichlet Allocation: Mixed Membership Modeling

Modeling text topics with Latent Dirichlet Allocation

Week6: Recap & Look ahead

Modeling text data with a hierarchy of clusters