FDS2023Fall_Practical01 (Deadline: 05/11/2023)

Authors

Tianjiao Liu, Yating Pan

Implementation of Linear Regression (Ridge, Lasso)

Tasks

  • Implement the learning task using numpy
  • Plot learning curves to understand model over-/underfitting
  • Linear models with polynomial basis expansions and regularisation (Ridge and Lasso) using scikit-learn
  • Optional: K-fold cross-validation to set hyper-parameters (5 bonus points)
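
As a rough illustration of the scikit-learn tasks listed above, the sketch below chains a polynomial basis expansion, feature scaling and a regularised linear model, then picks the regularisation strength by 5-fold cross-validation. It is not the provided skeleton code: the helper name make_model, the degree of 2, the alpha grid and the synthetic data are all assumptions chosen for the example, and the wine data is not used here.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def make_model(regressor):
    """Hypothetical helper: polynomial expansion -> scaling -> regularised regressor."""
    return Pipeline([
        # degree=2 is an assumption for this sketch, not a value from the practical
        ("poly", PolynomialFeatures(degree=2, include_bias=False)),
        ("scale", StandardScaler()),
        ("reg", regressor),
    ])


# Synthetic stand-in data (assumption): a quadratic target with a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 2))
y = 1.0 + X[:, 0] ** 2 - 2.0 * X[:, 1] + 0.1 * rng.normal(size=120)

# Candidate regularisation strengths (assumed grid).
param_grid = {"reg__alpha": [0.001, 0.01, 0.1, 1.0, 10.0]}

results = {}
for name, reg in [("ridge", Ridge()), ("lasso", Lasso(max_iter=10000))]:
    # cv=5 runs 5-fold cross-validation for every alpha in the grid.
    search = GridSearchCV(make_model(reg), param_grid, cv=5,
                          scoring="neg_mean_squared_error")
    search.fit(X, y)
    # Store the selected alpha and its mean cross-validated MSE.
    results[name] = (search.best_params_["reg__alpha"], -search.best_score_)

for name, (alpha, cv_mse) in results.items():
    print(f"{name}: best alpha = {alpha}, cross-validated MSE = {cv_mse:.4f}")
```

Placing the scaler inside the pipeline means it is refitted on the training folds only, so no information from a validation fold leaks into the preprocessing.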

Approximate number of lines of code

  • Size of provided skeleton code to start with: 160 lines
  • Size of your code: 50 lines

Details

In this practical, you will implement a linear regression model fitted by the least squares method. In the first part, you will implement the linear model from scratch using the numpy package and plot learning curves to determine whether the model is overfitting or underfitting. In the more advanced part, you will implement linear models with polynomial basis expansions and regularisation (Ridge and Lasso) using the scikit-learn library. The optional task is to use k-fold cross-validation to select the hyper-parameters for these models; completing it earns up to 5 bonus points. The practical uses the winequality dataset, which is available here: https://archive.ics.uci.edu/ml/datasets/Wine+Quality
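
The first part described above might look roughly like the following minimal sketch, which fits least squares weights with numpy and records training and test error for growing training set sizes, the raw numbers behind a learning curve. The function names (add_bias, fit_least_squares, predict, mse, learning_curve) are hypothetical and not taken from the skeleton code, and synthetic data stands in for the winequality dataset.

```python
import numpy as np


def add_bias(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


def fit_least_squares(X, y):
    """Return the weights w that minimise ||add_bias(X) @ w - y||^2."""
    # lstsq avoids explicitly inverting X^T X, which can be ill-conditioned.
    w, *_ = np.linalg.lstsq(add_bias(X), y, rcond=None)
    return w


def predict(X, w):
    return add_bias(X) @ w


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


def learning_curve(X_train, y_train, X_test, y_test, sizes):
    """Fit on the first n training rows for each n; return (train_errors, test_errors)."""
    train_err, test_err = [], []
    for n in sizes:
        w = fit_least_squares(X_train[:n], y_train[:n])
        train_err.append(mse(y_train[:n], predict(X_train[:n], w)))
        test_err.append(mse(y_test, predict(X_test, w)))
    return train_err, test_err


# Synthetic stand-in data (assumption): 3 features, known weights, small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, 2.0, -1.0, 0.5])  # intercept first
y = add_bias(X) @ true_w + 0.1 * rng.normal(size=200)

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

sizes = [10, 25, 50, 100, 150]
train_err, test_err = learning_curve(X_train, y_train, X_test, y_test, sizes)
w_hat = fit_least_squares(X_train, y_train)

for n, tr, te in zip(sizes, train_err, test_err):
    print(f"n = {n:3d}  train MSE = {tr:.4f}  test MSE = {te:.4f}")
```

Plotting train_err and test_err against sizes gives the learning curve. As a rule of thumb, a persistent gap between low training error and higher test error suggests overfitting, while two curves that converge at a high error suggest underfitting.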