- Developer: Penny Sun
- Product Owner: Jill Fan
- QA: Logan Wilson
-
Vision: Build a house price estimate application that gives consumers easy access to home value information by entering a list of criteria, and build a trusted marketplace for real estate information in the U.S.
-
Mission: Estimate house prices with statistical and machine learning models that analyze a large set of data points on each property, and continually reduce the RMSE prediction error. The estimate comes from a random forest model or an XGBoost model trained on the Kaggle Iowa State House Sale Dataset.
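For readers unfamiliar with the metric, here is a minimal sketch of how RMSE (root mean squared error) can be computed for a set of price predictions. The function name `rmse` and the sample numbers are illustrative assumptions, not code or results from this project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences of prices."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    # Square each error, average the squares, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative numbers only; they are not taken from the project.
actual_prices = [200000.0, 150000.0, 320000.0]
predicted_prices = [210000.0, 140000.0, 320000.0]
print(rmse(actual_prices, predicted_prices))
```

A lower value means the predicted prices sit closer to the actual sale prices, which is why the mission is framed as reducing this number over time.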
-
Success Criteria: A deployed web application that dynamically shows a house price estimate based on user input.
-
Clone repository.
-
Use conda to create a virtual environment
> House-Estimate$ conda create -n houseproject python=3
>
> House-Estimate$ source activate houseproject
-
Install required packages
> (houseproject) House-Estimate$ pip install -r requirements.txt
-
Set up house.env file with the following structure
export DATABASE_URL=XXX
export HOST=XXX
export USER=XXX
export PASSWORD=XXX
export DBNAME=XXX
export PORT=XXX
-
Set environment variables from file
source house.env
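Once the variables are exported, Python code can read them from the process environment. The sketch below shows one way to collect them and fail early if any are missing. The helper name `load_db_settings` is hypothetical and the example values are placeholders; how `application.py` actually consumes these variables is not shown in this README.

```python
import os

# Variable names come from the house.env template above.
REQUIRED_VARS = ("DATABASE_URL", "HOST", "USER", "PASSWORD", "DBNAME", "PORT")


def load_db_settings(env=None):
    """Collect the database settings, raising if any variable is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    settings["PORT"] = int(settings["PORT"])  # assumption: PORT is numeric
    return settings


# Placeholder values for illustration; real values live in house.env.
example_env = {
    "DATABASE_URL": "XXX",
    "HOST": "XXX",
    "USER": "XXX",
    "PASSWORD": "XXX",
    "DBNAME": "XXX",
    "PORT": "5432",
}
print(load_db_settings(example_env)["PORT"])
```

Calling `load_db_settings()` with no argument reads from `os.environ`, so it only succeeds in a shell where `source house.env` has already been run.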
-
(OPTIONAL) If you want to run the unit tests before running the code, run the following command:
> (houseproject) House-Estimate$ py.test
-
Download training data from [Kaggle](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data) as train.csv.
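As a quick sanity check after downloading, the file can be read with the standard library. This is a minimal sketch, assuming the `Id` and `SalePrice` column names used by the Kaggle dataset; the helper name `load_sale_prices` is hypothetical, and the in-memory sample rows are made up so the snippet runs without the real file.

```python
import csv
import io


def load_sale_prices(csv_file):
    """Return a list of (Id, SalePrice) pairs from an open train.csv-style file."""
    reader = csv.DictReader(csv_file)
    rows = []
    for row in reader:
        rows.append((int(row["Id"]), float(row["SalePrice"])))
    return rows


# Tiny in-memory stand-in for train.csv; the values are illustrative.
sample = io.StringIO("Id,LotArea,SalePrice\n1,9000,200000\n2,7500,150000\n")
prices = load_sale_prices(sample)
print(len(prices), prices[0])
```

For the real file, pass an open handle instead, for example `open("train.csv", newline="")` inside a `with` block.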
-
Launch the application
> (houseproject) House-Estimate$ python application.py
For this project, we used Pivotal Tracker, an agile project management tool, to keep track of overall progress. The Pivotal Tracker page for this project can be reached by clicking on this link.
-
EDA
: Jupyter Notebook that contains a walkthrough of the EDA. [jupyter notebook]
-
Random Forest Model
: A walkthrough of the overall random forest model building process. [jupyter notebook]
-
XGBoost Model
: An old version of the overall model building process. [jupyter notebook]
-
Step-by-step guide for database, environment, and Sphinx documentation setup. [Github Wiki]
-
You can find the slides for this project here.