Linear regression on population data. You implement the cost function and a gradient descent routine to find the fit coefficients. You see how the cost function varies with the fit coefficients, and you end up making a plot of your prediction.
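Roughly what that cost/gradient-descent pair looks like, sketched in Python/NumPy rather than the course's Octave (the toy numbers below are made up):

```python
import numpy as np

def compute_cost(X, y, theta):
    """Squared-error cost J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    m = len(y)
    errors = X @ theta - y
    return (errors @ errors) / (2 * m)

def gradient_descent(X, y, theta, alpha=0.01, num_iters=1500):
    """Batch gradient descent; returns fitted theta and the cost history."""
    m = len(y)
    history = []
    for _ in range(num_iters):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
        history.append(compute_cost(X, y, theta))
    return theta, history

# Toy usage: X carries a leading column of ones for the intercept term.
X = np.array([[1.0, 6.1], [1.0, 5.5], [1.0, 8.5], [1.0, 7.0]])
y = np.array([17.6, 9.1, 13.7, 12.0])
theta, history = gradient_descent(X, y, np.zeros(2))
```

Plotting `history` against the iteration number is a quick way to see the cost falling as the coefficients converge.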
Binary logistic regression on exam data. You calculate a sigmoid function (used for the cost), a cost function, a predict function, and a regularized cost function. fminunc is used to calculate the fit variables.
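A minimal Python sketch of those pieces, using scipy.optimize.minimize as a stand-in for fminunc (the exam-score numbers are invented and chosen so the classes overlap):

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Cross-entropy cost; h is clipped to keep log() finite."""
    m = len(y)
    h = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m

def gradient(theta, X, y):
    m = len(y)
    return X.T @ (sigmoid(X @ theta) - y) / m

def predict(theta, X):
    """Label 1 when the predicted probability is at least 0.5."""
    return (sigmoid(X @ theta) >= 0.5).astype(int)

# fminunc-style unconstrained minimization; X has a leading column of ones.
X = np.array([[1.0, 34.6, 78.0], [1.0, 82.3, 76.5], [1.0, 45.1, 56.3],
              [1.0, 60.2, 86.3], [1.0, 50.0, 45.0], [1.0, 79.0, 75.8]])
y = np.array([0, 0, 0, 1, 1, 1])
res = minimize(cost, np.zeros(X.shape[1]), args=(X, y), jac=gradient, method="BFGS")
print(predict(res.x, X))
```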
There was a bonus regularization exercise. They create a nonlinear (polynomial) model with some number of fit coefficients. You write the cost function and gradient descent for them.
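A sketch of the two ingredients that exercise needs: a polynomial feature map over two predictors and a regularized cost/gradient that leaves the intercept unpenalized. The degree and function names here are my own choices, not the course's:

```python
import numpy as np

def map_features(x1, x2, degree=6):
    """All polynomial terms x1^(i) * x2^(j) with i + j <= degree, plus a bias column."""
    cols = [np.ones_like(x1)]
    for total in range(1, degree + 1):
        for j in range(total + 1):
            cols.append((x1 ** (total - j)) * (x2 ** j))
    return np.column_stack(cols)

def regularized_cost(theta, X, y, lam):
    """Logistic cost plus an L2 penalty that skips the intercept theta[0]."""
    m = len(y)
    h = np.clip(1.0 / (1.0 + np.exp(-(X @ theta))), 1e-12, 1 - 1e-12)
    base = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    return base + (lam / (2 * m)) * (theta[1:] @ theta[1:])

def regularized_gradient(theta, X, y, lam):
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    grad = X.T @ (h - y) / m
    grad[1:] += (lam / m) * theta[1:]   # no penalty on the intercept term
    return grad
```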
First, you look at the trade-off between high bias (underfitting) and high variance (overfitting). You plot some data (a single predictor and response); the data is clearly nonlinear. You fit a straight line to it (you are underfitting). It's crap. To examine the bias-variance trade-off, you plot a learning curve, which shows the training and validation set errors as a function of training set size. For each point on the curve, you compute the fit parameters using only the first n training examples and calculate the squared (RSS) error on that subset. For the cross-validation error, however, you use those same fit parameters and calculate the error over the entire validation set. The training and validation errors converge quickly and stay high, suggesting large bias (more data doesn't help).
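A sketch of that learning-curve loop in Python (the least-squares fit is just a stand-in for whatever training routine is used, and the toy data is invented):

```python
import numpy as np

def linear_fit(X, y):
    """Least-squares fit; stand-in for the exercise's training step."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def mse(X, y, theta):
    errors = X @ theta - y
    return (errors @ errors) / (2 * len(y))

def learning_curve(X_train, y_train, X_val, y_val):
    """Train on the first n examples only; the training error uses that subset,
    the validation error always uses the full validation set."""
    train_err, val_err = [], []
    for n in range(1, len(y_train) + 1):
        theta = linear_fit(X_train[:n], y_train[:n])
        train_err.append(mse(X_train[:n], y_train[:n], theta))
        val_err.append(mse(X_val, y_val, theta))
    return train_err, val_err

# Toy usage: a straight line fit to a clearly quadratic target (high bias).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=20)
X = np.column_stack([np.ones(20), x])
y = x ** 2 + rng.normal(0, 0.5, size=20)
train_err, val_err = learning_curve(X[:12], y[:12], X[12:], y[12:])
```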
Next, you do this for polynomial regression. You start by fitting a high-degree polynomial (you are overfitting). You create a learning curve and see that there is a gap between the training and validation curves, suggesting large variance.
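For the polynomial version, the only new pieces are building the powers of the single predictor and scaling them (the degree here is just illustrative; the notes only say "high degree"):

```python
import numpy as np

def poly_features(x, degree=8):
    """Stack x, x^2, ..., x^degree as columns for a single predictor."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])

def normalize(X, mu=None, sigma=None):
    """Feature scaling; reuse the training-set mu/sigma for the validation set."""
    if mu is None:
        mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / sigma, mu, sigma
```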
Finally, you vary the regularization parameter and plot the training and validation error against it.
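That sweep looks roughly like the sketch below; fit_fn and error_fn are placeholder helpers (a regularized fit and an unregularized error), and the lambda grid is just illustrative:

```python
def validation_curve(X_train, y_train, X_val, y_val, fit_fn, error_fn):
    """Train once per lambda, then record training and validation error."""
    lambdas = [0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]
    train_err, val_err = [], []
    for lam in lambdas:
        theta = fit_fn(X_train, y_train, lam)
        # Errors are reported without the regularization term itself.
        train_err.append(error_fn(X_train, y_train, theta))
        val_err.append(error_fn(X_val, y_val, theta))
    return lambdas, train_err, val_err
```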
Use SVM to classify data, first using a linear kernel. You can do regularization with SVM using the C parameter.
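A quick scikit-learn equivalent of that first step (the 2-D points are made up; the exercise loads its data from a provided file):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.5], [6.0, 5.0], [7.0, 8.0], [8.0, 7.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# C acts like an inverse regularization strength: large C fits the training
# data tightly, small C regularizes more heavily.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)
print(clf.predict([[4.0, 4.0]]))
```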
Next, you build a spam classifier. You parse an email and turn the words into indices of a vocabulary list. Your feature vector is a vector of zeros and ones marking which vocabulary words appear in the email.
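The word-to-index and binary-feature steps, sketched in Python with a tiny made-up vocabulary (the course's list is far larger):

```python
import re

def email_to_indices(email_text, vocab):
    """Lower-case, keep only letter runs, and map each known word to its vocab index."""
    words = re.findall(r"[a-z]+", email_text.lower())
    return [vocab[w] for w in words if w in vocab]

def email_features(word_indices, vocab_size):
    """Binary feature vector: 1 if the vocabulary word appears in the email."""
    x = [0] * vocab_size
    for i in word_indices:
        x[i] = 1
    return x

vocab = {"buy": 0, "now": 1, "discount": 2, "meeting": 3}   # hypothetical vocabulary
idx = email_to_indices("Buy now!! Huge discount", vocab)
print(email_features(idx, len(vocab)))   # [1, 1, 1, 0]
```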