This is the repository accompanying the paper *Neural Networks for Predicting Algorithm Runtime Distributions* (Eggensperger, Lindauer and Hoos, IJCAI 2018). If you use this code, please cite:
```bibtex
@inproceedings{eggensperger-ijcai18,
  author    = {K. Eggensperger and M. Lindauer and H. Hoos},
  title     = {Neural Networks for Predicting Algorithm Runtime Distributions},
  booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'18)},
  year      = {2018}
}
```
It includes scripts and notebooks for running the experiments shown in the paper. The code has been written and tested with Python 3.5; all dependencies are listed in `requirements.txt`.
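Assuming a standard pip-based setup, the dependencies can be installed with:

```bash
# Install the pinned dependencies, ideally inside a Python 3.5 virtual environment
pip install -r requirements.txt
```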
The data used to train the networks can be found here. After downloading, please put the content in `./data/`.
- Create predictions using `eval_model.py`.
This script trains different models (DistNet, multi-output random forests (RFs), and independent RFs) on different distribution families (inverse Gaussian, lognormal, exponential) using cross-validation. Running the script trains either DistNet or both forest-based models on one fold for a given distribution type and scenario. The predictions are stored using pickle.
Also see `python eval_model.py --help` for the full list of options.
For example, to train both model types for a lognormal distribution on fold 0 of the clasp_factoring scenario:
```bash
python eval_model.py --model lognormal_distfit.floc --scenario clasp_factoring --fold 0 --seed 100 --save ./TEST_100 --num_train_samples 100
python eval_model.py --model lognormal_nn.floc --scenario clasp_factoring --fold 0 --seed 100 --save ./TEST_100 --num_train_samples 100
```
NOTE: To perform the full cross-validation and reproduce the results from the paper, you need to train each model on folds [0, 1, ..., 9], using seeds [100, 200, ..., 1000], for each distribution and each number of training samples [1, 2, 4, 8, 16, 32, 64, 100]; a sketch of such a sweep is given below.
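A minimal sketch of such a sweep, assuming the flags shown in the examples above; the model name used here is only one of several, and the full sweep has to cover all model/distribution names accepted by `eval_model.py`:

```bash
# Hypothetical full sweep for one model/distribution; repeat analogously for
# the other model names defined in eval_model.py (e.g. the *_distfit.floc variants).
for fold in $(seq 0 9); do
  for seed in $(seq 100 100 1000); do
    for n in 1 2 4 8 16 32 64 100; do
      python eval_model.py --model lognormal_nn.floc --scenario clasp_factoring \
        --fold "$fold" --seed "$seed" --num_train_samples "$n" --save "./RESULTS_${seed}"
    done
  done
done
```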
- Analyse the results using one of the two Jupyter notebooks provided in `./notebooks/`:
  - `CreateTable_evalModel-MultiSeed` creates a table with the average negative log-likelihood (NLLH) for each scenario and model.
  - `PlotSubsets_evalModel-MultiSeed` creates plots that show the average NLLH as a function of the number of observations per instance used for training the model.
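Assuming a standard Jupyter installation, the notebooks can be opened with:

```bash
# Launch Jupyter from the repository root (assumes jupyter is installed)
jupyter notebook notebooks/
```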
For details on how DistNet is trained and how the data is preprocessed, please have a look at the script `eval_model.py`.
Also, please have a look at the other notebooks, which provide further options for visualizing and analyzing runtime data.