ML house pricing

License: GNU General Public License v3.0 (GPL-3.0)

Predicting House Prices using Machine Learning

The link to the project website is attached below.

Project background

The real estate market has a great impact on people’s lives and on the economy of cities like São Carlos, which concentrates a wide variety of infrastructure and residents of all income levels. The city’s strategic location, booming industries, and renowned educational institutions have attracted a surge of potential homebuyers and investors. However, as house prices in São Carlos become more complex and variable, renting or buying a house has become a difficult task that usually involves negotiating deals, researching local areas, and guarding against fraud. There is therefore a growing need for accurate and reliable machine learning models that predict property values and help buyers, sellers, and real estate professionals make informed decisions.

The problem

A machine learning based solution can help forecast housing prices accurately across the different neighborhoods of São Carlos. By leveraging historical data, socioeconomic factors, and advanced algorithms, this project aims to provide a tool that helps people navigate the city’s complex real estate market. Accurate price predictions matter to various stakeholders, including homebuyers, sellers, real estate agents, and investors: they help buyers make informed decisions about their investments, assist sellers in setting competitive prices, and enable real estate professionals to give better guidance to their clients.

Project goals

The specific goals and deliverables are:

  • Create a comprehensive dataset by collecting and preprocessing real estate data from reliable sources, ensuring data quality and integrity.
  • Develop a robust machine learning model capable of accurately predicting house prices based on a variety of relevant features such as location, size, number of rooms, amenities, and historical sales data.
  • Evaluate and optimize the model's performance by employing various techniques such as feature engineering, model selection, hyperparameter tuning, and cross-validation.
  • Build an interactive web application that allows users to input the details of a house and obtain an estimated price prediction from the trained machine learning model.
  • Generate detailed documentation that outlines the project methodology, data preprocessing steps, model architecture, and any additional insights or findings discovered during the project, providing a clear roadmap for reproducibility and future enhancements.
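
The evaluation goal above mentions cross-validation. As a rough illustration of the idea only, the sketch below runs k-fold cross-validation with a one-feature least-squares line, using nothing beyond the Python standard library. The data is synthetic and made up for the example, and the helper names (`fit_line`, `k_fold_mae`), the single `area` feature, and the linear model are all assumptions, not the project's actual model or dataset.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def k_fold_mae(xs, ys, k=5, seed=0):
    """Return the mean absolute error measured on each of the k held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes into the same fold, so folds are disjoint.
    folds = [indices[i::k] for i in range(k)]
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(
            mean(abs(ys[i] - (slope * xs[i] + intercept)) for i in held_out)
        )
    return errors


# Synthetic, made-up data (not project data): area in square meters and a
# price generated as 3000 * area + 50000 plus Gaussian noise.
rng = random.Random(42)
areas = [rng.uniform(40, 300) for _ in range(100)]
prices = [3000 * a + 50000 + rng.gauss(0, 20000) for a in areas]

fold_errors = k_fold_mae(areas, prices, k=5)
print("MAE per fold:", [round(e) for e in fold_errors])
print("Mean MAE:", round(mean(fold_errors)))
```

Each fold is held out exactly once while the line is fitted on the remaining folds, so the averaged error estimates how the model behaves on houses it has not seen. A real pipeline would likely use more features and a dedicated ML library.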

Project Structure

├── LICENSE
├── README.md          <- The top-level README for developers/collaborators using this project.
│
├── reports            <- Folder containing the final reports/results of this project.
│   └── README.md      <- Details about final reports and analysis.
│
└── src                <- Source code folder for this project.
    │
    ├── data           <- Datasets used and collected for this project.
    │
    ├── docs           <- Folder for task documentation, meeting presentations, and task workflow documents and diagrams.
    │
    ├── references     <- Data dictionaries, manuals, and all other explanatory references used.
    │
    ├── results        <- Folder to store final analysis and modelling results and code.
    │
    ├── tasks          <- Master folder for all individual task folders.
    │
    └── visualizations <- Code and visualization dashboards generated for the project.
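
For collaborators who want to recreate this layout locally, the following is a minimal sketch that builds the same folders with `pathlib`. The folder names are copied from the tree above; the `scaffold` helper itself is hypothetical and not part of the repository, and the demonstration writes only to a temporary directory.

```python
import tempfile
from pathlib import Path

# Folder names copied from the tree above; the helper is illustrative only.
SRC_SUBFOLDERS = ["data", "docs", "references", "results", "tasks", "visualizations"]


def scaffold(root):
    """Create the folder layout under root and return the created directories."""
    root = Path(root)
    created = [root / "reports"] + [root / "src" / name for name in SRC_SUBFOLDERS]
    for directory in created:
        directory.mkdir(parents=True, exist_ok=True)
    # Empty placeholder READMEs at the two places the tree shows one.
    for readme in (root / "README.md", root / "reports" / "README.md"):
        readme.touch(exist_ok=True)
    return created


# Demonstration in a throwaway directory so nothing is written to the repo.
with tempfile.TemporaryDirectory() as tmp:
    created_dirs = scaffold(tmp)
    layout = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*"))
    print("\n".join(layout))
```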

Folder Overview

  • Reports - Folder to store all final reports of this project.
  • Data - Folder to store all the data collected and used for this project.
  • Docs - Folder for task documentation, meeting presentations, and task workflow documents and diagrams.
  • References - Folder to store any referenced code, research papers, and other useful documents used for this project.
  • Results - Folder to store the final analysis and modelling results for the project.
  • Tasks - Master folder for all tasks.
    • All task folder names should follow a specific naming convention.
    • All task folder names should be in chronological order (from 1 to n).
    • Every task folder should have a README.md file with the task details and goals, along with an info table listing all code/notebook files with their links and descriptions.
    • Update the task table whenever a task is created, and explain the purpose and goals of the task to others.
  • Visualizations - Folder to store dashboards, analysis, and visualization reports.
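
The task folder rules above (numbered 1 to n, each with a README.md) can be checked automatically. The sketch below is one possible way to do that. This README does not spell out the naming convention, so the pattern `task-<number>-<short-description>` and the `check_tasks` helper are assumptions made purely for illustration.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical pattern: the naming convention is not specified in this README,
# so "task-<number>-<short-description>" is assumed here for illustration.
TASK_NAME = re.compile(r"^task-(\d+)-[a-z0-9-]+$")


def check_tasks(tasks_dir):
    """Return a list of human-readable problems found under tasks_dir."""
    problems = []
    numbers = []
    for folder in sorted(p for p in Path(tasks_dir).iterdir() if p.is_dir()):
        match = TASK_NAME.match(folder.name)
        if match is None:
            problems.append(f"{folder.name}: name does not match the assumed pattern")
            continue
        numbers.append(int(match.group(1)))
        if not (folder / "README.md").is_file():
            problems.append(f"{folder.name}: missing README.md")
    # Task numbers should form the unbroken sequence 1, 2, ..., n.
    if sorted(numbers) != list(range(1, len(numbers) + 1)):
        problems.append(
            f"task numbers {sorted(numbers)} are not the sequence 1..{len(numbers)}"
        )
    return problems


# Demonstration on a throwaway directory with one valid and one faulty task.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "task-1-data-collection").mkdir()
    (root / "task-1-data-collection" / "README.md").write_text("# Task 1\n")
    (root / "task-3-modelling").mkdir()  # no README.md, and number 2 is skipped
    demo_problems = check_tasks(root)
    for problem in demo_problems:
        print(problem)
```

A check like this could run before merging a new task folder. The regular expression should be adjusted to whatever convention the team actually agrees on.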