Boston_housing_regression

This project evaluates the performance of several ML regression algorithms (linear regression, gradient boosting, random forest and KNN) at predicting the price (the target column) in the well-known "Boston housing" dataset, loaded from the sklearn library.

Required libraries

NumPy
pandas
Matplotlib
seaborn
scikit-learn (sklearn)

About the dataset

The Boston housing dataset is one of the best-known sklearn datasets.

It consists of 506 rows and 13 columns (feature variables), plus the target column, which is the median home price (MEDV, in thousands of dollars) for each row (example). Before any data visualization or modelling, all numeric variables have to be normalized so that results are not skewed towards the column with the largest scale.
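The normalization step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes min-max scaling via sklearn's `MinMaxScaler` (the project may use a different scaler), and it uses a synthetic 506 x 13 matrix with deliberately uneven column scales as a stand-in for the real features.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic stand-in for the 506 x 13 feature matrix,
# not the real Boston housing data.
rng = np.random.default_rng(0)
column_scales = np.array([1, 10, 100, 1000] * 3 + [1])  # 13 uneven scales
X = rng.random((506, 13)) * column_scales

# Rescale every column to the [0, 1] range so that no single
# column dominates because of its units.
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

print("largest value before scaling:", X.max())
print("largest value after scaling: ", X_scaled.max())
```

Distance-based models such as KNN are the most sensitive to unscaled features. A common refinement is to fit the scaler on the training split only and reuse it on the test split.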

Project implementation

import necessary libraries
load dataset online (no need to download it)
normalize data
visualize data
split data into training and testing sets
apply linear regression
apply random forest for regression
apply gradient boosting for regression
apply KNN for regression
evaluate the four models using sklearn metrics
visualize some actual and predicted results
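The steps above can be sketched end to end as follows. This is a hedged outline under stated assumptions rather than the project's own notebook: `load_boston` was removed from scikit-learn in version 1.2, so the sketch uses `make_regression` to build a synthetic dataset of the same shape (506 x 13). The 80/20 split, the default hyperparameters, the choice of R2 and RMSE as metrics, and the `models`, `results` and `predictions` names are all illustrative choices, and plotting is replaced by a short printout.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic stand-in with the same shape as the dataset.
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)

# Normalize the features, then split into training and testing sets.
X = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The four regressors named in the list, with default hyperparameters.
models = {
    "linear regression": LinearRegression(),
    "random forest": RandomForestRegressor(random_state=0),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
    "KNN": KNeighborsRegressor(),
}

# Fit each model and score it on the held-out test set.
results = {}
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    predictions[name] = y_pred
    results[name] = {
        "r2": r2_score(y_test, y_pred),
        "rmse": float(np.sqrt(mean_squared_error(y_test, y_pred))),
    }

for name, scores in results.items():
    print(f"{name:>18}: R2 = {scores['r2']:.3f}, RMSE = {scores['rmse']:.2f}")

# Compare a few actual and predicted values for one model.
for actual, predicted in zip(y_test[:5], predictions["linear regression"][:5]):
    print(f"actual = {actual:9.2f}   predicted = {predicted:9.2f}")
```

Because the stand-in data is generated from a linear model, linear regression will look better here than it would on real housing data. The ranking of the four models on the actual dataset may differ.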

Why is this project noteworthy?

Implementing this project with whichever regression algorithm you prefer can be a stepping stone into machine learning, especially regression. The simplicity of the dataset also makes it a good motivation to explore harder challenges later.

You can find out more about this dataset:

Boston Housing