This document explains how to use the provided scripts to generate the tables / figures in my AutoML benchmark report (https://www.overleaf.com/9694794sjksvwspnnfc).

NOTE:

  1. Some folders / scripts are too large to be uploaded here; please check Google Drive for the complementary files.

  2. Ready-to-use files:
    2.1) global_csv.csv: contains all the info you need for systematic analysis (see README_global_csv.txt and the short pandas sketch after this list).

    2.2) scores_of_simulation_results.zip: contains all scores used in 2.1)

    2.3) all_datasets_in_metafeatures.zip: contains all meta-feature files used in 2.1)

    2.4) simulation_results/res/: all predictions (pure methods, winner methods) are saved here. They are uploaded in small chunks to respect the GitHub maximum file size. Please unzip them to obtain the following directory structure: simulation_results/res/aad_freiburg, simulation_results/res/lise_sun (unzip lise_sun1.zip and lise_sun2.zip to lise_sun/), simulation_results/res/abhishek (unzip abhishek1.zip and abhishek2.zip to abhishek/), simulation_results/res/basic_model_autosklearn_tuned, etc.
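    A quick way to start working with global_csv.csv from 2.1) is sketched below (a minimal sketch, assuming pandas is installed and that the file sits in the repository root; the actual column semantics are documented in README_global_csv.txt):

        import pandas as pd

        # Load the aggregated results table; the column meanings are documented in README_global_csv.txt.
        df = pd.read_csv("global_csv.csv")
        print(df.shape)
        print(df.columns.tolist())
        print(df.head())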

===========================More info========================================
I) Performance of winner solutions on all 30 datasets
This table reports the task-specific scores of the winner solutions applied to all 30 datasets.

I.1) The predictions are saved in simulation_results/res/. To reproduce them, unzip winner_submission.zip, go to the winner submission folder of interest, and execute run.py either under the env "Codalab-AutoML-env" (see "create_Codalab-AutoML-env_on_tipi.txt" for instructions) or using the docker image "lisesun/codalab_automl2016:3.0" that I created for this purpose (https://cloud.docker.com/u/lisesun/repository/docker/lisesun/codalab_automl2016). The predictions of the basic models with default HP and auto-sklearn-tuned HP can be reproduced with the 'lisesun/codalab_automl2016:3.0' and 'mfeurer/auto-sklearn' (https://hub.docker.com/r/mfeurer/auto-sklearn) images, respectively.
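For illustration only, the following minimal sketch drives one such reproduction run from Python (assumptions: docker is installed, winner_submission.zip has been unzipped in the working directory, the submission folder name below is hypothetical, and run.py is assumed to follow the usual Codalab convention of taking an input directory and an output directory; check each submission's own files for the exact arguments):

    import subprocess
    from pathlib import Path

    submission = Path("winner_submission/aad_freiburg").resolve()   # hypothetical submission folder
    output_dir = Path("simulation_results/res/reproduced").resolve()
    output_dir.mkdir(parents=True, exist_ok=True)

    # Run run.py inside the Codalab image so the original environment is reproduced.
    subprocess.run(
        [
            "docker", "run", "--rm",
            "-v", f"{submission}:/submission",
            "-v", f"{output_dir}:/output",
            "lisesun/codalab_automl2016:3.0",
            "python", "/submission/run.py", "/submission", "/output",   # argument order is an assumption
        ],
        check=True,
    )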

I.2) The scores are saved in scores_of_simulation_results.
To reproduce these scores, edit scoring_program/score.py so that it points to the right prediction files (see the instructions in score.py for more details) and execute it.
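If you only want to sanity-check a single score outside of score.py, a rough sketch of the idea is given below (assumptions: whitespace-separated *.predict and *.solution files, a binary task, and plain accuracy as a stand-in metric; the real task-specific metrics are implemented in the scoring program, and the file names below are hypothetical):

    import numpy as np

    # Hypothetical file names; point these at one prediction/solution pair of interest.
    pred = np.loadtxt("simulation_results/res/aad_freiburg/dataset_test.predict")
    sol = np.loadtxt("input_data/dataset_test.solution")

    # Stand-in metric: fraction of exact matches after thresholding the predicted scores.
    acc = np.mean((pred > 0.5).astype(int) == sol.astype(int))
    print(f"accuracy = {acc:.4f}")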

I.3) For aad_freiburg (i.e. auto-sklearn), there are 2 scores: one achieved within the task-specific time budget, the other achieved with a much longer learning process. The corresponding prediction files are stored in simulation_results/res/aad_freiburg/aad_freiburg_timebudget/ and simulation_results/res/aad_freiburg/aad_freiburg_global/.

II) Error bars

Error bars are computed using scoring_program/error_bar.py: a 10% subset of the test set is resampled without replacement, the score is recomputed on that subset, and this is repeated 100 times.
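A minimal sketch of that procedure (assuming the test-set ground truth and predictions are already loaded as NumPy arrays, with mean squared error as a placeholder metric; error_bar.py is the authoritative implementation):

    import numpy as np

    def error_bar(y_true, y_pred, metric, n_repeats=100, frac=0.1, seed=0):
        """Resample frac of the test set without replacement, n_repeats times,
        and return the mean and standard deviation of the metric over the resamples."""
        rng = np.random.default_rng(seed)
        n = len(y_true)
        k = max(1, int(frac * n))
        scores = []
        for _ in range(n_repeats):
            idx = rng.choice(n, size=k, replace=False)   # without replacement
            scores.append(metric(y_true[idx], y_pred[idx]))
        return np.mean(scores), np.std(scores)

    # Example with a placeholder metric (mean squared error) on made-up data:
    y_true = np.random.rand(1000)
    y_pred = y_true + 0.1 * np.random.randn(1000)
    mean_score, err = error_bar(y_true, y_pred, lambda a, b: np.mean((a - b) ** 2))
    print(f"score = {mean_score:.4f} +/- {err:.4f}")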

III) Learning curves

Generated by learning_curves/plot_learning_curve.py: the time axis is logarithmic by default, but it can be set to linear (normal) time.
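For illustration, a minimal matplotlib sketch of the log-time vs. normal-time option (assuming the learning curve is available as arrays of timestamps and scores; the data shown here are made up, and plot_learning_curve.py remains the script actually used for the report):

    import matplotlib.pyplot as plt
    import numpy as np

    # Hypothetical learning-curve data: (time in seconds, score reached at that time).
    times = np.array([10, 30, 60, 120, 300, 600, 1200])
    scores = np.array([0.42, 0.55, 0.61, 0.66, 0.70, 0.71, 0.72])

    fig, ax = plt.subplots()
    ax.plot(times, scores, marker="o")
    ax.set_xscale("log")          # default in the report; comment out for linear ("normal") time
    ax.set_xlabel("time (s)")
    ax.set_ylabel("score")
    ax.set_title("Learning curve")
    fig.savefig("learning_curve_example.png")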

All curves shown in the report are stored in learning_curves/LearningCurvePng/.

IV) Pure methods with default HP