The code uses the Python libraries numpy, matplotlib, pickle, pandas and scikit-learn. The official documentation for each of these libraries has been linked.
Dhruvee Birla and myself.
This assignment was done as a part of the Machine, Data and Learning course, Spring 2021.
Task 1 has been answered in the report (report.pdf). The notebook (code.ipynb) contains the remaining tasks. The training and test data used in these tasks are in the data directory. The report also contains the observations and conclusions for Tasks 2-4.
Understanding linear regression and the method LinearRegression.fit().
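As a minimal sketch of what LinearRegression.fit() does, the example below fits a line to a small synthetic dataset. The data (y = 3x + 2) and the variable names are assumptions made for illustration and are not taken from the report or the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 3x + 2 with no noise.
x = np.arange(10, dtype=float).reshape(-1, 1)  # fit() expects a 2-D feature array
y = 3.0 * x.ravel() + 2.0

model = LinearRegression()
model.fit(x, y)  # estimates coef_ and intercept_ by ordinary least squares

slope = model.coef_[0]
intercept = model.intercept_
print(slope, intercept)
```

Because the data lie exactly on a line, the fitted slope and intercept recover 3 and 2 up to floating-point error.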
Resample the given data, train models on the resampled sets, and calculate the bias and variance of the trained models.
The bias and variance are calculated for the following classes of functions:
y = ax + b
y = ax^2 + bx + c
y = ax^3 + bx^2 + cx + d
And so on, up to a polynomial of degree 20.
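The procedure above could be sketched roughly as follows. This is not the notebook's code: the synthetic noisy cubic stands in for the files in the data directory, and the helper name `bias_variance`, the choice of 10 disjoint training chunks, and measuring bias against the observed test targets are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Synthetic stand-in for the provided data (assumption): a noisy cubic.
def true_f(x):
    return x**3 - 2 * x

x_train = rng.uniform(-2, 2, size=800)
y_train = true_f(x_train) + rng.normal(0, 1.0, size=800)
x_test = rng.uniform(-2, 2, size=80)
y_test = true_f(x_test) + rng.normal(0, 1.0, size=80)

def bias_variance(degree, n_splits=10):
    """Return (mean squared bias, mean variance) over the test points."""
    poly = PolynomialFeatures(degree=degree, include_bias=False)
    X_test = poly.fit_transform(x_test.reshape(-1, 1))
    preds = []
    # Train one model per disjoint chunk of the training set.
    for xs, ys in zip(np.array_split(x_train, n_splits),
                      np.array_split(y_train, n_splits)):
        X = poly.fit_transform(xs.reshape(-1, 1))
        preds.append(LinearRegression().fit(X, ys).predict(X_test))
    preds = np.array(preds)                       # shape: (n_splits, n_test)
    mean_pred = preds.mean(axis=0)                # average prediction per test point
    bias_sq = np.mean((mean_pred - y_test) ** 2)  # measured against observed targets
    variance = np.mean(preds.var(axis=0))         # spread across the trained models
    return bias_sq, variance

# One (bias^2, variance) pair for each polynomial degree from 1 to 20.
results = {d: bias_variance(d) for d in range(1, 21)}
```

On this synthetic data, the degree-1 model shows a larger squared bias than the degree-3 model, and the degree-20 model shows a larger variance than the degree-1 model, which is the usual underfitting and overfitting pattern.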
Tabulating the values of the irreducible error for the models in Task 2, and observing the changes, if any.
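One common way to obtain the irreducible error is as the remainder of the decomposition MSE = bias² + variance + irreducible error. The sketch below assumes that convention and assumes bias is measured against the observed test targets; whether the notebook does exactly this is not stated here. The arrays are hypothetical placeholders, not results from the assignment.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical placeholders: 50 observed test targets and the
# predictions of 10 separately trained models on those points.
y_test = rng.normal(size=50)
preds = y_test + rng.normal(size=(10, 50))

mse = np.mean((preds - y_test) ** 2)                   # averaged over models and points
bias_sq = np.mean((preds.mean(axis=0) - y_test) ** 2)  # squared bias of the mean prediction
variance = np.mean(preds.var(axis=0))                  # variance across models

irreducible = mse - bias_sq - variance
print(irreducible)
```

Under these assumptions the three terms satisfy an exact algebraic identity, so the remainder is zero up to floating-point rounding for every model, regardless of the polynomial degree.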
Plotting the graph and evaluating which models are underfit or overfit, and then using this plot to determine the type of train and test data.
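A plot of this kind might be produced as in the sketch below. The curves are hypothetical placeholders chosen only to give the typical shape (bias falling and variance rising with degree); they are not the values from the report, and in practice the arrays would hold the Task 2 results.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np

degrees = np.arange(1, 21)

# Hypothetical placeholder curves, NOT results from the assignment.
bias_sq = 1.0 / degrees
variance = 0.01 * degrees**2
total = bias_sq + variance

fig, ax = plt.subplots()
ax.plot(degrees, bias_sq, label="bias^2")
ax.plot(degrees, variance, label="variance")
ax.plot(degrees, total, label="bias^2 + variance")
ax.set_xlabel("polynomial degree")
ax.set_ylabel("error")
ax.legend()

# Degrees well below the minimum of the total suggest underfitting,
# degrees well above it suggest overfitting.
best_degree = int(degrees[np.argmin(total)])
```

Reading the minimum of the combined curve is one simple heuristic for separating underfit from overfit models; the report may use a different criterion.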