This project exercises my Python skills for data analysis.

Primary language: Jupyter Notebook. License: MIT.

House Sales Data Analysis

This is a study project using fictional data from Kaggle, available at http://kaggle.com/harlfoxem/housesalesprediction

📚 1 - Exercises included

Some examples available in the notebook:

  • 1.1 Basic Operations Tasks:
    • How many houses are available for purchase?
    • What is the most expensive house?
    • What is the average house price for homes with 2 bathrooms?
    • Among the houses with living rooms over 300 square meters, how many have more than 2 bathrooms?
  • 1.2 Data Manipulation Tasks:
    • Create a new column called dormitory type
    • Change the data type of 'yr_renovated' to DATE
    • What is the earliest renovation in the dataset?
    • How many houses are "good" and considered "new_houses"?
  • 1.3 Data Structure Tasks:
    • Create a bar chart for the sum of prices by number of bedrooms
    • Create a line graph for the average price by year built
    • Create a bar chart for the average price by dormitory type
    • Create a dashboard with the previous 3 graphs.
  • 1.4 Control Structures Tasks:
    • Add information to the dataset using API requests.
    • Create a map view with filters.
    • Create dashboard views with filters.
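
A few of the tasks above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the rows are made up, the column names other than `yr_renovated` are assumed to match the Kaggle CSV, and the `dormitory_type` labels and thresholds, as well as treating `yr_renovated == 0` as "never renovated", are assumptions for the example.

```python
import pandas as pd

# Hypothetical sample rows; the notebook would load the Kaggle CSV instead.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "price": [300000.0, 450000.0, 250000.0, 900000.0, 500000.0],
    "bedrooms": [1, 2, 3, 4, 2],
    "bathrooms": [1.0, 2.0, 2.0, 3.0, 2.0],
    "yr_renovated": [0, 1995, 0, 2010, 0],
})

# 1.1 Basic operations
n_houses = df["id"].nunique()                            # houses available
most_expensive_id = df.loc[df["price"].idxmax(), "id"]   # id of the priciest house
avg_price_2_bath = df.loc[df["bathrooms"] == 2, "price"].mean()


# 1.2 Data manipulation: labels and thresholds are assumptions
def dormitory_type(bedrooms: int) -> str:
    if bedrooms == 1:
        return "studio"
    if bedrooms == 2:
        return "apartment"
    return "house"


df["dormitory_type"] = df["bedrooms"].apply(dormitory_type)

# Convert renovation years to dates, assuming 0 means "never renovated".
# Rows without a renovation end up as NaT after index alignment.
renovated = df.loc[df["yr_renovated"] > 0, "yr_renovated"]
df["yr_renovated_date"] = pd.to_datetime(renovated.astype(str), format="%Y")
earliest_renovation = df["yr_renovated_date"].min()

# 1.3 Aggregation that could feed the bar chart of prices by bedrooms
price_by_bedrooms = df.groupby("bedrooms")["price"].sum()
```

Keeping the converted dates in a separate column leaves the original `yr_renovated` values available for comparison; overwriting the column in place would work equally well.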

🛠 2 - Tools used

  • Jupyter Notebook
  • pandas
  • numpy
  • matplotlib
  • seaborn
  • geopy
  • requests
  • multiprocessing
  • plotly
  • ipywidgets

🚀 3 - Next Steps

  • Use the project data to find valuable business insights.
  • Publish the findings using Streamlit and Heroku.
  • Organize the code using functions.