This is the companion code for the blog post on model consistency.
- Clone this repo
- cd into the cloned local copy
- Generate data: `python prepare_data_set.py`
- Run the analysis from the terminal: `./control_freak.sh`

This will run a classification example, looping through the available versions of scikit-learn, xgboost, catboost, h2o and lightgbm. The output will be saved to `results_clf.txt`.
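For orientation, here is a hypothetical dry-run sketch (in Python, although the actual loop lives in `control_freak.sh`) of what such a version loop boils down to. The library name and version numbers below are illustrative only, not the repo's actual pins:

```python
# Hypothetical sketch of a control_freak.sh-style version loop:
# for each pinned version, reinstall the library, rerun the model
# script, and append its CSV row to the results file. This only
# *builds* the shell commands; the real script executes them.
def loop_commands(library, versions, script="run_models.py",
                  results="results_clf.txt"):
    """Return the install/run shell commands for each pinned version."""
    cmds = []
    for v in versions:
        cmds.append(f"pip install {library}=={v}")
        cmds.append(f"python {script} >> {results}")
    return cmds

# Illustrative versions only -- see control_freak.sh for the real list.
for cmd in loop_commands("scikit-learn", ["0.21.3", "0.22.2"]):
    print(cmd)
```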
- *nix and OSX only :)
- By default `./control_freak.sh` will call the `run_models.py` file, which runs the classification analysis. Change it to `run_models_reg.py` if you want to run the regression analysis instead.
- By default `./control_freak.sh` will save the output to `results_clf.txt`. Change the name in `./control_freak.sh` as needed when running the regression analysis, e.g. by commenting in `echo "library,version,f1_score,timeit" > results_reg.txt`.
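Once a run finishes, the results file is plain CSV with the header from the `echo` line above. A small sketch (assuming exactly that header; the sample rows are made up, not real measurements) for summarising how much the score varies across versions per library:

```python
import csv
import io

# Sketch: read a results file with the header
# library,version,f1_score,timeit and report, per library, the
# min/max F1 score across versions -- a quick consistency check.
def f1_spread(csv_text):
    """Return {library: (min_f1, max_f1)} from results CSV text."""
    spreads = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        f1 = float(row["f1_score"])
        lo, hi = spreads.get(row["library"], (f1, f1))
        spreads[row["library"]] = (min(lo, f1), max(hi, f1))
    return spreads

# Illustrative rows only, not real measurements.
sample = """library,version,f1_score,timeit
sklearn,0.21.3,0.91,1.2
sklearn,0.22.2,0.93,1.1
xgboost,0.90,0.95,2.3
"""
print(f1_spread(sample))
```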
- Requirements: We're looping through tons of old versions. See `./control_freak.sh` for details.
- Warning: If you test very old versions of scikit-learn, `train_test_split` may not be available. Generate the data before running the analysis to avoid this problem.
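A related guard, if you do want model code to import across old releases: `train_test_split` moved to `sklearn.model_selection` in scikit-learn 0.18 (earlier releases exposed it from `sklearn.cross_validation`). A hedged fallback-import sketch, which also tolerates scikit-learn being absent entirely:

```python
# Hedged sketch: before scikit-learn 0.18, train_test_split lived in
# sklearn.cross_validation; newer releases expose it from
# sklearn.model_selection. The nested fallback keeps the import
# working across versions (and degrades to None if sklearn is missing).
try:
    from sklearn.model_selection import train_test_split
except ImportError:
    try:
        from sklearn.cross_validation import train_test_split
    except ImportError:
        train_test_split = None  # scikit-learn too old or not installed
```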