This repository contains the code to replicate the human data scientist's results reported in this Medium post: https://medium.com/@leapingclam/the-death-of-data-scientists-c243ae167701
- use file SpeedDating_preprocessing_models.ipynb for data preprocessing, feature engineering, and modeling. Note: we re-tuned our local models after the presentation and improved their performance, so the outcome differs from the blog.
- data: Speed_Dating_Clean.csv
- use file ASHRAE_preprocessing.ipynb to replicate data preprocessing and feature engineering.
- use file ASHRAE_models.ipynb to train models.
- please download the data set from: https://www.kaggle.com/c/ashrae-energy-prediction/data
- note: the file ASHRAE_weather.ipynb is attached for reference only. It contains the EDA on the weather data and the different approaches we tried for interpolating missing values. You do not need to run this file to replicate the outcome of our blog.
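
For readers unfamiliar with interpolating missing weather readings, the following is a minimal sketch of one common approach: linear interpolation within each site, ordered by time. It is not necessarily the method used in ASHRAE_weather.ipynb. The helper name `interpolate_weather`, the column names `site_id`, `timestamp`, and `air_temperature`, and the toy data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def interpolate_weather(weather, group_col="site_id", time_col="timestamp",
                        value_cols=("air_temperature",)):
    """Fill missing values per site by linear interpolation (hypothetical helper).

    Rows are sorted by site and time first, so interpolation follows
    chronological order. Gaps at the start or end of a site's series
    are filled with the nearest valid reading.
    """
    out = weather.sort_values([group_col, time_col]).copy()
    for col in value_cols:
        out[col] = out.groupby(group_col)[col].transform(
            lambda s: s.interpolate(method="linear", limit_direction="both")
        )
    return out


# Toy data (made up for illustration, not taken from the Kaggle data set).
demo = pd.DataFrame({
    "site_id": [0, 0, 0, 1, 1, 1],
    "timestamp": pd.to_datetime(["2016-01-01 00:00", "2016-01-01 01:00",
                                 "2016-01-01 02:00"] * 2),
    "air_temperature": [10.0, np.nan, 14.0, np.nan, 5.0, 5.0],
})

filled = interpolate_weather(demo)
print(filled)
```

Interpolating within each site keeps readings from one location from leaking into another. Note that `method="linear"` treats rows as equally spaced, which is only reasonable when the timestamps are regular (for example, hourly).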
- Joseph Chin, UT Austin MSBA ’20: joseph.chin@utexas.edu
- Aifaz Gowani, UT Austin MSBA ’20: aifazg92@gmail.com
- Gabriel James, UT Austin MSBA ’20: gabejames@me.com
- Matthew Peng, UT Austin MSBA ’20: matthew.peng@utexas.edu