mathematics-of-machine-learning-pca

Programming Assignments for Imperial College London's Mathematics of Machine Learning: Principal Component Analysis (PCA) course.


NOTE: This repository is for learning purposes only. Please follow the Coursera honor code. I've posted the answers here with the intent of helping you debug your own code. I encourage you to use the discussion forums available via Coursera and to use this repo to understand why your program isn't working as expected. Best of luck!

Lectures

Principal Component Analysis is a form of dimensionality reduction. It analyses, and then exploits, the structure of the data and the correlations between the different variables within the data set. The key goal of PCA is a more compact, lower-dimensional representation of the data that loses as little vital information as possible.

Principal Component Analysis (PCA) is one of the most important dimensionality reduction algorithms in machine learning. In this course, we lay the mathematical foundations to derive and understand PCA from a geometric point of view. We learn how to summarize data sets (e.g., images) using basic statistics, such as the mean and the variance, and we look at how these statistics behave when we shift or scale the original data set. The course provides mathematical intuition as well as the skills to derive the results, and we implement the results in code (Jupyter notebooks), putting our mathematical understanding into practice by computing averages of image data sets.
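As a minimal sketch of this idea (not the assignment code; random data stands in for the course's image data set, and all names here are illustrative):

```python
import numpy as np

# Toy stand-in for an image data set: 100 images of size 8x8.
rng = np.random.default_rng(0)
images = rng.random((100, 8, 8))

# Represent each image as a vector: an (N, D) data matrix with D = 64.
X = images.reshape(len(images), -1)

mean_image = X.mean(axis=0)   # the "average image" as a D-vector
var_image = X.var(axis=0)     # per-pixel variance across the data set

print(mean_image.reshape(8, 8).shape)   # (8, 8): view the mean as an image again
```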

Objectives

Week 1: Interpret the effects of linear transformations on means and (co)variances. Compute means/variances of linearly transformed data sets. Write code that represents images as vectors and computes basic statistics of datasets.
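For instance, a short sketch (with synthetic data, not the course's) verifying that for y = Ax + b the mean transforms affinely while the covariance ignores the shift:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))              # (N, D) data set
A = np.array([[2.0, 0.0],
              [1.0, 0.5]])                 # linear map
b = np.array([1.0, -3.0])                  # shift

Y = X @ A.T + b                            # transformed data: y_n = A x_n + b

# mean(Y) = A mean(X) + b,   cov(Y) = A cov(X) A^T
np.testing.assert_allclose(Y.mean(axis=0), A @ X.mean(axis=0) + b, atol=1e-10)
np.testing.assert_allclose(np.cov(Y, rowvar=False),
                           A @ np.cov(X, rowvar=False) @ A.T, atol=1e-10)
```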

Week 2: Introduce and practice the concept of an inner product, which allows us to talk about geometric concepts in vector spaces. More specifically, we start with the dot product as a special case of an inner product and then move toward a more general concept of an inner product, which plays an integral part in some areas of machine learning, such as kernel machines (including support vector machines and Gaussian processes).
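A minimal sketch of a general inner product ⟨x, y⟩ = xᵀAy for a symmetric positive definite A (the function name and matrices here are illustrative, not the assignment's):

```python
import numpy as np

def inner(x, y, A):
    """General inner product <x, y> = x^T A y for symmetric positive definite A."""
    return x @ A @ y

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

I = np.eye(2)                     # A = I recovers the ordinary dot product
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])        # a different, valid inner product

# Length and angle induced by the inner product:
length_x = np.sqrt(inner(x, x, A))
cos_angle = inner(x, y, A) / (np.sqrt(inner(x, x, A)) * np.sqrt(inner(y, y, A)))

assert np.isclose(inner(x, y, I), x @ y)   # dot product as the special case A = I
```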

Week 3: Look at orthogonal projections of vectors that live in a high-dimensional vector space onto lower-dimensional subspaces. This plays an important role in the next module, where we derive PCA.
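A minimal sketch of the standard projection formula π(x) = B(BᵀB)⁻¹Bᵀx, where the columns of B span the subspace (the helper name is illustrative):

```python
import numpy as np

def project(x, B):
    """Orthogonal projection of x onto the subspace spanned by the columns of B.

    lstsq solves the normal equations B^T B lam = B^T x for the coordinates lam,
    so B @ lam is the projection B (B^T B)^{-1} B^T x.
    """
    lam, *_ = np.linalg.lstsq(B, x, rcond=None)
    return B @ lam

x = np.array([6.0, 0.0, 0.0])
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])        # basis of a 2D subspace of R^3

p = project(x, B)
assert np.allclose(B.T @ (x - p), 0.0)   # residual is orthogonal to the subspace
```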

Week 4: Think of dimensionality reduction as a way of compressing data with some loss, similar to JPEG or MP3. Principal Component Analysis (PCA) is one of the most fundamental dimensionality reduction techniques used in machine learning.
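To make the compression analogy concrete, here is a minimal PCA sketch via the eigendecomposition of the data covariance matrix (again with synthetic data and an illustrative function name, not the assignment's implementation):

```python
import numpy as np

def pca_compress(X, k):
    """Project centered data onto its top-k principal components and reconstruct."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :k]              # top-k principal directions
    codes = Xc @ W                           # k numbers per point (the compressed form)
    X_hat = codes @ W.T + mean               # lossy reconstruction
    return codes, X_hat

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # correlated 5D data
codes, X_hat = pca_compress(X, k=2)
print(codes.shape, X_hat.shape)              # (200, 2) (200, 5)
```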

Assignments