Hack This Fall 3.0

Project: House Price Prediction System

The official submission of team Sinister Bytes (HTF-059) for Hack This Fall 3.0. The problem statement for the project is based on Data Science. The project modules are Exploratory Data Analysis, Data Preprocessing, Data Analysis, and Data Visualization.

Problem Statement

Following the sustained decline of stocks in the stock market, people's investing patterns have gradually shifted towards the real estate market. House price prediction is important for improving real estate efficiency. Previously, house prices were determined by calculating the acquisition and selling prices in a locality. A house price prediction model is therefore essential for filling this information gap. House prices generally increase every year, so there is a need for a system that predicts future house prices. Such predictions can help a developer determine the selling price of a house and can help a customer choose the right time to purchase one. The price of a house is strongly affected by factors such as physical condition, area, concept, and location. With this model, we aim to predict prices more accurately.

Technologies Required

This project falls under the category of Machine Learning. This project requires Python and the following Python libraries installed:

NumPy
Pandas
matplotlib
Scikit-learn
Seaborn
Streamlit

You will also need software that can run a Jupyter Notebook. To view the visualizations described below, you will also need to install Power BI.

Code

The code for the machine learning model is in the HTF House Price Prediction.ipynb Jupyter notebook. You will also need the TransformedHousePrice.csv dataset file, which is fed to the machine learning model. While some code has already been implemented to get you started, you will need to execute all the code blocks to successfully complete the project. The dataset contains 21609 rows and 31 columns. Raw house price contains the unclean (i.e. dirty) data. Note that HTF.pbix is a Power BI file meant to provide an out-of-the-box visualization experience. If you are interested in how the visualizations are created, please feel free to explore this Power BI file.
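Before running the notebook, it can help to confirm that the dataset loads and matches the shape stated above. The following is a minimal sketch, not the notebook's own code: the helper name `load_house_prices` is hypothetical, and the file location is an assumption.

```python
from pathlib import Path

import pandas as pd

# Shape stated in this README; adjust if your copy of the dataset differs.
EXPECTED_SHAPE = (21609, 31)


def load_house_prices(csv_path, expected_shape=None):
    """Read the CSV into a DataFrame and optionally verify its shape.

    Hypothetical helper for illustration; it is not part of the notebook.
    """
    path = Path(csv_path)
    if not path.exists():
        raise FileNotFoundError(f"Dataset not found: {path}")
    df = pd.read_csv(path)
    if expected_shape is not None and df.shape != tuple(expected_shape):
        raise ValueError(
            f"Expected shape {tuple(expected_shape)}, got {df.shape}"
        )
    return df


# Example call (assumes the CSV sits in the current working directory):
# df = load_house_prices("TransformedHousePrice.csv", EXPECTED_SHAPE)
```

Failing early on a missing file or an unexpected shape makes it easier to notice when the wrong CSV has been picked up.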

Project Modules

Exploratory Data Analysis

The main purpose of EDA is to examine the data before making any assumptions. It can help identify obvious errors, better understand patterns within the data, detect outliers or anomalous events, and find interesting relations among the variables.
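Two of the checks mentioned above, spotting missing values and detecting outliers, can be sketched with pandas as follows. This is an illustrative example rather than the notebook's actual EDA: the column names `price` and `area` and the sample values are made up, and the 1.5 x IQR rule is one common outlier heuristic among several.

```python
import pandas as pd


def missing_value_report(df):
    """Count missing values per column, keeping only columns that have any."""
    counts = df.isna().sum()
    return counts[counts > 0].sort_values(ascending=False)


def iqr_outlier_mask(series, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Tiny made-up sample standing in for the real dataset.
sample = pd.DataFrame({
    "price": [200, 210, 220, 230, 5000],
    "area": [90.0, None, 110.0, 120.0, 130.0],
})

print(missing_value_report(sample))               # columns with gaps
print(sample[iqr_outlier_mask(sample["price"])])  # rows with outlying prices
```

On the real data, the same functions could be applied to the DataFrame loaded from TransformedHousePrice.csv, with the column names replaced by those actually present in the file.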

Data Preprocessing

Real-world data generally contains noise and missing values, and may be in a format that cannot be used directly by machine learning models. Data preprocessing cleans the data and makes it suitable for a machine learning model, which can also increase the model's accuracy and efficiency. Here, the data was cleaned and refined.
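As a rough illustration of this kind of cleaning, the sketch below drops duplicate rows and fills missing values. It is an assumption about what a typical cleaning step might look like, not a record of how this project's data was actually cleaned. The function name `clean_frame`, the column names, and the choice of median and mode imputation are all hypothetical.

```python
import pandas as pd


def clean_frame(df):
    """Drop duplicate rows, then fill gaps: median for numeric columns,
    most frequent value for all other columns."""
    cleaned = df.drop_duplicates().copy()
    for col in cleaned.columns:
        if cleaned[col].isna().sum() == 0:
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(cleaned[col]):
            cleaned[col] = cleaned[col].fillna(cleaned[col].median())
        else:
            modes = cleaned[col].mode()
            if not modes.empty:
                cleaned[col] = cleaned[col].fillna(modes.iloc[0])
    return cleaned.reset_index(drop=True)


# Made-up rows: one exact duplicate and one row with missing values.
raw = pd.DataFrame({
    "area": [100.0, 100.0, None, 140.0],
    "city": ["A", "A", None, "B"],
})
cleaned = clean_frame(raw)
print(cleaned)
```

Median imputation is less sensitive to extreme values than the mean, which matters for skewed quantities such as prices. Whether imputing or dropping incomplete rows is the better choice depends on the dataset.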

Data Analysis

Data analysis is the process of cleaning, changing, and processing raw data and extracting actionable, relevant information that helps businesses make informed decisions. The procedure helps reduce the risks inherent in decision-making by providing useful insights and statistics, often presented in charts, images, tables, and graphs.

Data Visualization

Data visualization is the representation of data through use of common graphics, such as charts, plots, infographics, and even animations. These visual displays of information communicate complex data relationships and data-driven insights in a way that is easy to understand. Here, we have used Power BI for visualization.

Team Members

Ashish Yadav, Kevin Modi, Khushal Ghathalia.

khushal786/Hack-This-Fall-3.0-