Model evaluation mini-project

Aim: Develop a program to evaluate the performance of several supervised models on regression datasets

This repo applies a range of supervised learning models to several regression datasets and compares how well they predict each target.


I used this dataset from Kaggle for car prices: (save it in the root directory as "car_data.csv")

I used this dataset from Kaggle for test scores: (save it in the root directory as "test_scores.csv")
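As a rough illustration of how one of these CSV files could be turned into features and a target, here is a minimal sketch using pandas. The helper name `load_csv_dataset` and the target column name `"price"` are assumptions made for the example; the actual column names depend on the Kaggle files, and the repo may load the data differently.

```python
import pandas as pd


def load_csv_dataset(path, target_column):
    """Read a CSV file and split it into a feature table and a target series.

    Hypothetical helper: `target_column` must be the name of the column
    to predict, which depends on the dataset being used.
    """
    frame = pd.read_csv(path)
    # Drop rows with missing values so every model can be fitted.
    frame = frame.dropna()
    y = frame[target_column]
    features = frame.drop(columns=[target_column])
    # One-hot encode text columns; numeric columns pass through unchanged.
    X = pd.get_dummies(features)
    return X, y


if __name__ == "__main__":
    # "price" is an assumed column name, not taken from the dataset.
    X, y = load_csv_dataset("car_data.csv", "price")
    print(X.shape, y.shape)
```

One-hot encoding is only one way to handle categorical columns; it is used here because all five models need numeric inputs.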

How to run the project

Install dependencies with:

$ pip3 install -r requirements.txt

Run the project with:

$ python3


The following models were run on each dataset:

  • K-Nearest-Neighbours
  • Linear Regression
  • Decision Tree
  • Random Forest
  • SVR
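For reference, the five models listed above map onto scikit-learn regressors roughly as follows. This is a sketch rather than the repo's actual code: the function name `build_models` and the `random_state` argument are assumptions, and all other hyperparameters are left at their scikit-learn defaults.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def build_models(random_state=0):
    """Return a fresh, unfitted instance of each of the five regressors.

    Hypothetical helper: the seed only affects the tree-based models,
    which are the ones with randomness in their fitting procedure.
    """
    return {
        "K-Nearest-Neighbours": KNeighborsRegressor(),
        "Linear Regression": LinearRegression(),
        "Decision Tree": DecisionTreeRegressor(random_state=random_state),
        "Random Forest": RandomForestRegressor(random_state=random_state),
        "SVR": SVR(),
    }
```

Keeping the models in a dictionary keyed by display name makes it straightforward to loop over them and label the results.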

The program evaluates the performance of the 5 models above. For each model:

  • it should evaluate the performance on the validation set ✅

  • it should return a train, val and test loss value and R-squared score for that hyperparameterisation ✅

  • it should return the hyperparameters which resulted in that score ✅

  • it should return the time taken to fit the model ✅

  • it should evaluate the models on all of sklearn's toy regression datasets available in sklearn.datasets ✅

  • create a file which loops through each dataset and each model, printing the results ✅

  • graphical visualisations of:

      • time to fit each of the best models ✅

      • final train, validation and test set loss/MSE scores ✅

      • final train, validation and test set R-squared scores (= model.score) ✅
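The checklist above can be sketched in a few lines of scikit-learn. The example below is an illustration under stated assumptions, not the repo's implementation: the function names `split_data` and `evaluate`, the 60/20/20 split, and the `n_neighbors` grid are all made up for the example. It fits one model per hyperparameter combination, times the fit, records MSE and R-squared on the train, validation and test sets, and keeps the combination with the lowest validation MSE.

```python
import time

from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import ParameterGrid, train_test_split
from sklearn.neighbors import KNeighborsRegressor


def split_data(X, y, random_state=0):
    """Split into train/val/test sets (assumed 60/20/20 proportions)."""
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.4, random_state=random_state
    )
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=random_state
    )
    return {
        "train": (X_train, y_train),
        "val": (X_val, y_val),
        "test": (X_test, y_test),
    }


def evaluate(model_cls, param_grid, splits):
    """Return the results for the hyperparameters with the lowest validation MSE."""
    best = None
    for params in ParameterGrid(param_grid):
        model = model_cls(**params)
        # Time only the call to fit, not the scoring.
        start = time.perf_counter()
        model.fit(*splits["train"])
        result = {"params": params, "fit_time": time.perf_counter() - start}
        for name, (X, y) in splits.items():
            result[f"{name}_mse"] = mean_squared_error(y, model.predict(X))
            result[f"{name}_r2"] = model.score(X, y)  # R-squared for regressors
        if best is None or result["val_mse"] < best["val_mse"]:
            best = result
    return best


if __name__ == "__main__":
    X, y = load_diabetes(return_X_y=True)
    splits = split_data(X, y)
    # The grid values are illustrative, not the ones used in the project.
    print(evaluate(KNeighborsRegressor, {"n_neighbors": [3, 5, 10]}, splits))
```

Selecting on the validation set and reporting the test score only for the chosen hyperparameters keeps the test score an honest estimate of performance on unseen data. The same `evaluate` call can be repeated for each model and each dataset to produce the numbers behind the plots.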