Train, test split on OpenML data

Question

pplonski opened this issue 5 years ago · 2 comments

I would like to run my AutoML package on the same train/test splits of the OpenML datasets. Could you please give some instructions on how to obtain the exact train/test splits?

Answer 1 · 2020-04-10T18:56:54.000Z

Hi @pplonski,

The easiest way to run on the OpenML datasets is to use automlbenchmark, as we did in our paper: https://github.com/openml/automlbenchmark

When you run through automlbenchmark, you automatically get the same train/test splits as in our paper, because the splits are predefined.
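If you only want to inspect the split indices outside of automlbenchmark, here is a minimal sketch using the `openml` Python package. It assumes the predefined splits are the fold definitions stored with each OpenML task, which I have not verified against every benchmark setting. `load_task` and `get_split` are hypothetical helper names for illustration, and the task ID is something you would take from the benchmark definition.

```python
def load_task(task_id):
    """Download an OpenML task (requires `pip install openml` and network access)."""
    import openml  # imported lazily so get_split can be used without the package

    return openml.tasks.get_task(task_id)


def get_split(task, fold=0, repeat=0):
    """Return (train_indices, test_indices) for one predefined fold of a task.

    `task` is expected to be an OpenML task object; the indices refer to
    row positions in the task's dataset.
    """
    train_idx, test_idx = task.get_train_test_split_indices(fold=fold, repeat=repeat)
    return train_idx, test_idx


# Example usage (commented out because it needs network access):
# task = load_task(SOME_TASK_ID)   # SOME_TASK_ID: placeholder, not a real value
# train_idx, test_idx = get_split(task, fold=0)
# X, y = task.get_X_and_y()
# X_train, y_train = X[train_idx], y[train_idx]   # use X.iloc[...] if X is a DataFrame
```

Looping `fold` over the number of folds defined by the task would give every train/test pair; running through automlbenchmark is still the safer way to match the paper's setup exactly.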

All you need to do is create a framework wrapper for your AutoML package in automlbenchmark, and it should work for you.

There is an open issue about officially adding AutoGluon to automlbenchmark; you could follow a similar path: openml/automlbenchmark#93

Hope this helps!

Answer 2 · 2020-04-11T04:13:52.000Z

Thank you Nick!