The Death of Data Scientists - Will AutoML Replace Data Scientists?

This repository contains the code to replicate the human data scientists' results from this Medium post: https://medium.com/@leapingclam/the-death-of-data-scientists-c243ae167701

Speed Dating data set

  • use file SpeedDating_preprocessing_models.ipynb for data preprocessing, feature engineering, and modeling (note: we re-tuned our local models after the presentation and improved their performance, so the results differ from those in the blog post).
  • data: Speed_Dating_Clean.csv

ASHRAE data set

  • use file ASHRAE_preprocessing.ipynb to replicate data preprocessing and feature engineering.
  • use file ASHRAE_models.ipynb to train models.
  • please download the data set from: https://www.kaggle.com/c/ashrae-energy-prediction/data
  • note: the file ASHRAE_weather.ipynb is attached for reference only. It contains the EDA on the weather data and the different approaches we tried for interpolating missing values. You do not need to run it to replicate the results in our blog post.
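
For readers unfamiliar with filling gaps in weather data, here is a minimal sketch of one common approach: linear interpolation within each site's time series. It is not necessarily one of the approaches tried in ASHRAE_weather.ipynb. The function name interpolate_weather is hypothetical, and the column names (site_id, timestamp, air_temperature) are assumed to match the Kaggle weather files; the small data frame is synthetic and stands in for the real data.

```python
import numpy as np
import pandas as pd


def interpolate_weather(weather: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Fill gaps in each site's series by linear interpolation (hypothetical helper).

    Assumes the frame has 'site_id' and 'timestamp' columns. Rows are treated
    as equally spaced in time once sorted. Gaps at the start or end of a site's
    series are filled with the nearest valid value.
    """
    out = weather.sort_values(["site_id", "timestamp"]).copy()
    for col in columns:
        # Interpolate within each site so values never leak across sites.
        out[col] = out.groupby("site_id")[col].transform(
            lambda s: s.interpolate(method="linear", limit_direction="both")
        )
    return out


# Synthetic stand-in for the weather file: two sites, three hourly readings each.
demo = pd.DataFrame(
    {
        "site_id": [0, 0, 0, 1, 1, 1],
        "timestamp": pd.to_datetime(
            ["2016-01-01 00:00", "2016-01-01 01:00", "2016-01-01 02:00"] * 2
        ),
        "air_temperature": [10.0, np.nan, 14.0, np.nan, 5.0, 7.0],
    }
)

filled = interpolate_weather(demo, ["air_temperature"])
print(filled)
```

Grouping by site before interpolating keeps one site's readings from being used to fill another site's gaps, which matters because the sites are in different locations.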

Contributors