Code for House Prices: Advanced Regression Techniques (https://www.kaggle.com/c/house-prices-advanced-regression-techniques)
- Python: 3.6
- numpy(1.14.5)
- pandas(0.23.4)
- scipy(1.1.0)
- matplotlib(2.2.3)
- xgboost(0.80)
- seaborn
- sklearn(>=0.19.2)
(These versions are not strict requirements; they are simply my laptop's configuration. Later versions are likely to work.)
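To compare your environment against the list above, a small check like the following may help. It is only a sketch: the `report_versions` helper is a hypothetical name, not part of this project, and it assumes each package exposes a `__version__` attribute.

```python
import importlib

# Import names of the packages listed above, with the versions from that list.
# seaborn has no version listed, so it is left as None.
LISTED = {
    "numpy": "1.14.5",
    "pandas": "0.23.4",
    "scipy": "1.1.0",
    "matplotlib": "2.2.3",
    "xgboost": "0.80",
    "seaborn": None,
    "sklearn": "0.19.2",
}


def report_versions(listed=LISTED):
    """Return {package: installed version, or None if it cannot be imported}."""
    found = {}
    for name in listed:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing package, or a package that fails while loading.
            found[name] = None
            continue
        found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, installed in report_versions().items():
        print(f"{name}: installed={installed}, listed={LISTED[name]}")
```

A value of `None` in the installed column means the package could not be imported.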
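Clone the project to your local computer. (Install git first if you do not already have it. Git is a system tool installed through your OS package manager or from git-scm.com, not through pip.)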
git clone https://github.com/DonaldRR/housePricePred.git
You need to download the datasets from Kaggle's House Prices: Advanced Regression Techniques page, or from the command line (this only works if the Kaggle CLI is installed and your API credentials are set up):
kaggle competitions download -c house-prices-advanced-regression-techniques
By default, you need to create two directories (datasets and processed_data) under your project directory CurrentProjectDir to store the data, like this:
-->CurrentProjectDir
    -->datasets
    -->processed_data
Of course, you can change the dataset directories in config.py.
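If you prefer to create the default layout from Python rather than by hand, something like this should do it. This is a minimal sketch: `make_data_dirs` is a hypothetical helper that is not part of the repository, and it assumes the default directory names given above (adjust them if you changed config.py).

```python
from pathlib import Path


def make_data_dirs(project_dir="."):
    """Create datasets/ and processed_data/ under project_dir if they are missing."""
    root = Path(project_dir)
    created = []
    for name in ("datasets", "processed_data"):
        path = root / name
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `make_data_dirs()` from the project directory creates both folders, after which the downloaded Kaggle CSVs can be placed in datasets.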
First, process your original CSVs.
python preprocess.py
The processed files will appear under the processed_data directory.
Run one or more models to train on the data. The model_type argument accepts 'nn' or 'xgb'.
python run_model.py [model_type [model_type [model_type [...]]]]
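For readers unfamiliar with this style of usage line, here is one way a script could accept a variable number of model_type arguments. It is an illustrative sketch using the standard argparse module; the actual run_model.py may parse its arguments differently, and `parse_model_types` and `SUPPORTED` are hypothetical names.

```python
import argparse

# Model types named in this README.
SUPPORTED = ("nn", "xgb")


def parse_model_types(argv=None):
    """Parse zero or more positional model types and validate each one."""
    parser = argparse.ArgumentParser(description="Train one or more models.")
    parser.add_argument(
        "model_type",
        nargs="*",  # zero or more positional values
        help="model(s) to train: 'nn' or 'xgb'",
    )
    args = parser.parse_args(argv)
    for name in args.model_type:
        if name not in SUPPORTED:
            # parser.error prints the usage message and exits.
            parser.error(f"unknown model_type {name!r}; choose from {SUPPORTED}")
    return args.model_type
```

Under this sketch, `python run_model.py xgb nn` would yield the list `["xgb", "nn"]`, and the script could then train each model in turn.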
The TODOs in model.py are what you need to complete.