
Analysis and model evaluation of house prices in India.


Indian House Price Prediction

A regression-based project for predicting house 🏠 prices 💲.

Project objective

The purpose of this project is to study how different factors (independent variables) influence the price (target variable) of houses in India, and to predict house prices using different machine learning algorithms.

Technology and libraries used

  • Python
  • matplotlib
  • seaborn
  • numpy
  • pandas
  • scikit-learn
  • scipy
  • etc.

Overview

This project uses regression to predict house prices in India.

Data source

The dataset can be downloaded from Kaggle.

⚡Exploratory data analysis

  • First, the dataset is checked for missing or null values. Any that are found are either dropped or replaced with the mean or median, depending on the analysis.

  • Of all the features, Area has the greatest influence on the target variable, i.e. price.

  • Then we analyse the categorical variables.

  • Then we remove the outliers.
  • The Area feature is visualized using a boxplot.
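
The missing-value step above can be sketched roughly as follows. This is a minimal illustration on made-up data: the column names (area, bedrooms, price) and the choice to drop rows with a missing target and fill features with the median are assumptions, not the notebook's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are assumptions, not the Kaggle schema.
df = pd.DataFrame({
    "area": [1200.0, 850.0, np.nan, 1500.0, 990.0],
    "bedrooms": [3.0, 2.0, 2.0, np.nan, 2.0],
    "price": [75.0, 48.0, 52.0, 98.0, np.nan],
})

# Count missing values per column.
missing_counts = df.isnull().sum()
print(missing_counts)

# Drop rows where the target itself is missing (assumed policy).
clean = df.dropna(subset=["price"])

# Fill the remaining gaps in the numeric features with each column's median.
feature_cols = ["area", "bedrooms"]
clean = clean.fillna(clean[feature_cols].median())
```

The median is less sensitive to extreme values than the mean, which may matter for a skewed feature such as area.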

⚡Feature engineering

  • To select the features that influence the target variable, the corr() method is used to examine the relationship between the independent variables and the target variable.
  • A heatmap is used to visualize the correlations.
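
As a rough sketch of correlation-based feature selection, the example below ranks features by their absolute correlation with the target using pandas' corr(). The data is synthetic, and the column names and the 0.3 cutoff are assumptions chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic data in which price depends mostly on area (an assumption for the demo).
rng = np.random.default_rng(0)
n = 200
area = rng.uniform(500, 3000, n)
bedrooms = rng.integers(1, 6, n).astype(float)
noise_col = rng.normal(0, 1, n)  # unrelated to price by construction
price = 0.05 * area + 2.0 * bedrooms + rng.normal(0, 5, n)

df = pd.DataFrame(
    {"area": area, "bedrooms": bedrooms, "noise_col": noise_col, "price": price}
)

# Pairwise Pearson correlations between all numeric columns.
corr = df.corr()

# Rank features by absolute correlation with the target.
target_corr = corr["price"].drop("price").abs().sort_values(ascending=False)
print(target_corr)

# Keep features above a hypothetical threshold.
selected = target_corr[target_corr > 0.3].index.tolist()
print(selected)
```

The same `corr` matrix can be passed to `seaborn.heatmap(corr, annot=True)` to draw the heatmap.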

⚡Data preprocessing

  • Only the features with the greatest influence on the target variable (price) are kept; the rest are dropped.
  • The outliers are then removed using the IQR (interquartile range) method.
  • The dataset is then scaled using StandardScaler.
  • Finally, the dataset is split into train and test sets using the train_test_split function.
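
The preprocessing steps above might look roughly like the sketch below. The data is synthetic, and the helper remove_outliers_iqr, the 1.5 multiplier, and the column names are assumptions. Note one deliberate difference from the order listed above: this sketch splits first and fits the scaler on the training set only, a common way to keep test-set statistics out of the scaling.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data with one injected outlier (illustrative only).
rng = np.random.default_rng(42)
area = rng.uniform(500, 3000, 100)
price = 0.05 * area + rng.normal(0, 5, 100)
df = pd.DataFrame({"area": area, "price": price})
df.loc[0, "area"] = 50000.0


def remove_outliers_iqr(frame, column, k=1.5):
    """Keep rows whose `column` value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = frame[column].quantile(0.25)
    q3 = frame[column].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return frame[frame[column].between(lower, upper)]


clean = remove_outliers_iqr(df, "area")

X = clean[["area"]]
y = clean["price"]

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training features, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```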

⚡Model training and evaluation

  • The model is trained with sklearn on the training set and tested on the test set.
  • Metric used: RMSE (root mean squared error) for model evaluation.
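
A minimal sketch of training and RMSE evaluation is shown below. The README does not say which estimators were used, so LinearRegression is an assumption here, and the data is synthetic. RMSE is computed as the square root of the mean squared difference between actual and predicted prices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic single-feature data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(200, 1))
y = 0.05 * X[:, 0] + rng.normal(0, 5, 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# LinearRegression is an assumed stand-in for the project's models.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# RMSE: square root of the mean squared error on the held-out set.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Test RMSE: {rmse:.2f}")
```

RMSE is expressed in the same units as the target, so a lower value means predictions are closer to the actual prices on average.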