

Primary language: MATLAB · License: GNU General Public License v3.0 (GPL-3.0)

Machine Learning by Andrew Ng


This repository contains my answers, as code together with explanations in PDF form, for the programming assignments in this course. The programming language used is MATLAB.

Exercises

The outlines below give a bird's-eye view of how each PDF report is structured.

Exercise 1: Linear Regression

The exercise covered implementing linear regression with one variable to predict profits for a food truck. The dataset contains city populations and the corresponding profits.

1 Defining the problem and dataset
2 Exploring the data
3 Gradient Descent
   3.1 Update Equations
   3.2 Implementation
   3.3 Computing the Cost
   3.4 Gradient Descent
4 Visualizations
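
The report's code is in MATLAB; purely as an illustrative sketch of the update equations and cost computation in section 3, the same idea in NumPy (function names and toy data are mine, not the assignment's) might look like:

```python
import numpy as np

def compute_cost(X, y, theta):
    """Squared-error cost J(theta) = (1 / 2m) * sum((X @ theta - y)^2)."""
    m = len(y)
    err = X @ theta - y
    return (err @ err) / (2 * m)

def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent: theta := theta - (alpha / m) * X'(X @ theta - y)."""
    m = len(y)
    for _ in range(num_iters):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
    return theta

# Toy data generated from y = 1 + 2x, with an intercept column prepended.
X = np.c_[np.ones(5), np.arange(5.0)]
y = 1 + 2 * np.arange(5.0)
theta = gradient_descent(X, y, np.zeros(2), alpha=0.1, num_iters=2000)
```

On this toy problem the iterates converge to the exact parameters, which is a convenient sanity check before plotting the cost surface.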

Exercise 2: Logistic Regression

The exercise covered logistic regression applied to two different datasets. The first dataset was used to build a model that predicts whether a student gets admitted into a university. The second was used to explore regularization and predict whether microchips from a fabrication plant pass quality assurance (QA).

1 Logistic Regression
   1.1 Challenge
   1.2 Visualizing the data
   1.3 Implementation
     1.3.1 Hypothesis and Sigmoid Function
     1.3.2 Cost Function and Gradient of the Cost
     1.3.3 Learning parameters using fminunc
     1.3.4 Evaluating logistic regression
2 Regularized Logistic Regression
   2.1 Challenge
   2.2 Visualizing the data
   2.3 Feature Mapping
   2.4 Cost Function and Gradient
   2.5 Plotting the decision boundary
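
The hypothesis, sigmoid, cost, and gradient of sections 1.3.1 and 1.3.2 are implemented in MATLAB in the repo; a rough NumPy equivalent (my own names and example data, shown only for illustration) could be:

```python
import numpy as np

def sigmoid(z):
    """Hypothesis nonlinearity g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(theta, X, y):
    """Cross-entropy cost and its gradient for logistic regression."""
    m = len(y)
    h = sigmoid(X @ theta)
    J = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    grad = X.T @ (h - y) / m
    return J, grad

# Tiny example: at theta = 0 the hypothesis is 0.5 everywhere, so J = log(2).
X = np.c_[np.ones(4), [1.0, 2.0, 3.0, 4.0]]
y = np.array([0.0, 0.0, 1.0, 1.0])
J0, grad0 = cost_function(np.zeros(2), X, y)
```

Checking that the cost at the zero vector equals log(2) is the same sanity check the assignment uses before handing the cost to an optimizer such as fminunc.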

Exercise 3: Multi-class Classification and Neural Networks

The exercise covered the problem of multi-class classification in recognizing handwritten digits, and the implementation of the solution using one-vs-all logistic regression and neural networks (feedforward only).

1 Multi-class Classification
   1.1 Challenge
   1.2 Dataset
   1.3 Visualizing the Data
   1.4 Vectorizing Logistic Regression
     1.4.1 Vectorizing the cost function
     1.4.2 Vectorizing the gradient
     1.4.3 Vectorizing regularized logistic regression
   1.5 One-vs-all Classification
     1.5.1 One-vs-all Training
2 One-vs-all VS Neural Nets
3 Neural Networks
   3.1 Model Representation
   3.2 Feedforward Propagation and Prediction
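
The feedforward prediction of section 3.2 is done in MATLAB in the repo; as a hedged NumPy sketch (layer and variable names are mine), the forward pass with pre-trained weight matrices Theta1 and Theta2 might look like:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(Theta1, Theta2, X):
    """Feedforward pass: add bias units, propagate, pick the most probable class."""
    m = X.shape[0]
    a1 = np.c_[np.ones(m), X]                       # input layer + bias unit
    a2 = np.c_[np.ones(m), sigmoid(a1 @ Theta1.T)]  # hidden layer + bias unit
    a3 = sigmoid(a2 @ Theta2.T)                     # output layer, one unit per class
    return np.argmax(a3, axis=1)
```

Each row of the output matrix holds one probability-like activation per class, and the predicted label is simply the index of the largest one.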

Exercise 4: Neural Networks Learning

The exercise covered the implementation of the backpropagation algorithm for neural networks applied to the task of handwritten digit recognition.

1 Neural Networks
   1.1 Visualizing the data
   1.2 Model Representation
   1.3 Feedforward and cost function
   1.4 Regularized cost function
2 Backpropagation
   2.1 Sigmoid gradient
   2.2 Random Initialization
   2.3 Backpropagation
   2.4 Regularized Neural Networks
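
Two of the smaller building blocks from sections 2.1 and 2.2, the sigmoid gradient and the random weight initialization, are easy to show in isolation. A NumPy sketch (the repo's versions are MATLAB; epsilon_init = 0.12 is the value the course suggests):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_gradient(z):
    """g'(z) = g(z) * (1 - g(z)); used when backpropagating through each layer."""
    s = sigmoid(z)
    return s * (1 - s)

def rand_initialize_weights(l_in, l_out, epsilon_init=0.12):
    """Symmetry breaking: weights drawn uniformly from [-epsilon, epsilon]."""
    return np.random.uniform(-epsilon_init, epsilon_init, (l_out, 1 + l_in))
```

Initializing all weights to zero would make every hidden unit compute the same function, which is why the random initialization step exists at all.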

Exercise 5: Regularized Linear Regression and Bias VS Variance

The exercise covered the implementation of regularized linear regression and the study of different bias-variance properties.

1 Regularized Linear Regression
   1.1 Visualizing the dataset
   1.2 Regularized linear regression's cost function
   1.3 Regularized linear regression's gradient
   1.4 Fitting linear regression
2 Bias-variance
   2.1 Learning curves
3 Polynomial Regression
   3.1 Learning Polynomial Regression
   3.2 Selecting lambda using a cross validation set
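
The regularized cost and gradient of sections 1.2 and 1.3 can be sketched in NumPy as follows (an illustration only, not the repo's MATLAB code; note the bias term is deliberately left out of the penalty):

```python
import numpy as np

def linear_reg_cost(theta, X, y, lam):
    """Regularized squared-error cost and gradient; theta[0] (bias) is not penalized."""
    m = len(y)
    err = X @ theta - y
    J = (err @ err) / (2 * m) + lam * (theta[1:] @ theta[1:]) / (2 * m)
    grad = X.T @ err / m
    grad[1:] += lam * theta[1:] / m
    return J, grad

# A perfect fit (y = x) has zero data cost, so any remaining cost is pure penalty.
X = np.array([[1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0])
theta = np.array([0.0, 1.0])
```

Sweeping lam over a grid and scoring each fit on a cross-validation set, as in section 3.2, is then just a loop over calls to this cost inside an optimizer.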

Exercise 6: Support Vector Machines

The exercise covered the implementation of Support Vector Machines (SVM) to build a spam classifier.

1 Support Vector Machines
   1.1 Example Dataset 1
   1.2 SVM with Gaussian Kernels
     1.2.1 Gaussian Kernel
     1.2.2 Example Dataset 2
     1.2.3 Example Dataset 3
2 Spam Classification
   2.1 Preprocessing Emails
   2.2 Extracting Features from Emails
   2.3 Training SVM for Spam Classification
   2.4 Top predictors for Spam
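
The Gaussian kernel of section 1.2.1 is a one-liner; a NumPy sketch of it (the repo implements this in MATLAB) is:

```python
import numpy as np

def gaussian_kernel(x1, x2, sigma):
    """Similarity K(x1, x2) = exp(-||x1 - x2||^2 / (2 * sigma^2))."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-(diff @ diff) / (2.0 * sigma ** 2))
```

The kernel equals 1 for identical points and decays toward 0 as the points move apart, with sigma controlling how quickly that decay happens.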

Exercise 7: K-Means Clustering and Principal Component Analysis

The exercise covered the implementation of the K-Means algorithm and its application to image compression, as well as Principal Component Analysis (PCA) to find a low-dimensional representation of face images.

1 K-Means Clustering
   1.1 Implementing K-Means
     1.1.1 Finding closest centroids (Cluster Assignment Step)
     1.1.2 Computing centroid means
   1.2 K-means on example dataset
   1.3 Random Initialization
   1.4 Image Compression with K-means
     1.4.1 K-means on pixels
2 Principal Component Analysis
   2.1 Example Dataset
   2.2 Implementing PCA
   2.3 Dimensionality Reduction with PCA
     2.3.1 Projecting the data onto the principal components
     2.3.2 Reconstructing an approximation of the data
     2.3.3 Visualizing the projections
   2.4 Face Image Dataset
     2.4.1 PCA on Faces
     2.4.2 Dimensionality Reduction
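
The two alternating steps of K-means from section 1.1 translate naturally to NumPy; a hedged sketch (my names, not the repo's MATLAB functions) of the cluster-assignment and centroid-update steps:

```python
import numpy as np

def find_closest_centroids(X, centroids):
    """Cluster assignment step: index of the nearest centroid for every example."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def compute_centroids(X, idx, K):
    """Move step: each centroid becomes the mean of the points assigned to it."""
    return np.array([X[idx == k].mean(axis=0) for k in range(K)])
```

Iterating these two functions until the assignments stop changing is the whole algorithm; image compression then amounts to running it on pixel colors and replacing each pixel with its centroid.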

Exercise 8: Anomaly Detection and Recommender Systems

The exercise covered an Anomaly Detection algorithm and its application to detecting failing servers on a network, and Collaborative Filtering to build a recommender system for movies.

1 Anomaly Detection
   1.1 Gaussian distribution
   1.2 Estimating parameters for a Gaussian
   1.3 Selecting the threshold
   1.4 High dimensional dataset
2 Recommender Systems
   2.1 Movie ratings dataset
   2.2 Collaborative filtering learning algorithm
     2.2.1 Collaborative filtering cost function
     2.2.2 Collaborative filtering gradient
     2.2.3 Regularized cost function
     2.2.4 Regularized gradient
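
The parameter estimation of sections 1.1 and 1.2 fits a Gaussian to each feature independently; a NumPy sketch of that step (illustrative only, the repo's version is MATLAB) could be:

```python
import numpy as np

def estimate_gaussian(X):
    """Maximum-likelihood mean and variance (1/m normalization) per feature."""
    return X.mean(axis=0), X.var(axis=0)

def gaussian_density(X, mu, sigma2):
    """p(x) as a product of independent per-feature Gaussians."""
    p = np.exp(-((X - mu) ** 2) / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
    return p.prod(axis=1)
```

Examples whose density falls below a threshold epsilon, chosen on a labeled cross-validation set as in section 1.3, are flagged as anomalies.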