OMLRandomBotv2

The current implementation of the bot works as follows:

  1. Initialize the bot with an OML task.id
  2. Draw a learner with probability proportional to the dimension of its parameter set
  3. Draw a random hyperparameter configuration for that learner
  4. Resample the drawn learner and hyperparameters on the OML task
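
Steps 2 and 3 can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the bot's actual code: the `param_spaces` list, its parameter names and bounds, and the helpers `draw_learner()` and `draw_config()` are all hypothetical. The real parameter spaces are defined in learners.R.

```r
# Hypothetical parameter spaces: one list of numeric (lower, upper) ranges per learner.
# Names and bounds are illustrative only; the real spaces live in learners.R.
param_spaces = list(
  rpart  = list(cp = c(0, 1), maxdepth = c(1, 30)),
  glmnet = list(alpha = c(0, 1)),
  ranger = list(mtry_frac = c(0.1, 1), sample_frac = c(0.1, 1), min_node_size = c(1, 50))
)

# Step 2: draw a learner with probability proportional to its number of parameters.
draw_learner = function(spaces) {
  dims = lengths(spaces)
  sample(names(spaces), size = 1, prob = dims / sum(dims))
}

# Step 3: draw one value uniformly from each parameter range.
# Integer or categorical parameters would need extra handling (e.g. rounding).
draw_config = function(space) {
  lapply(space, function(range) runif(1, min = range[1], max = range[2]))
}

set.seed(1)
learner = draw_learner(param_spaces)
config = draw_config(param_spaces[[learner]])
```

With these toy spaces, ranger (three parameters) is drawn three times as often as glmnet (one parameter). This is the weighting that the open question about sampling by parameter set dimension refers to.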

Learners

From the old bot

  • xgboost
  • svm
  • kernel knn
  • random forest
  • rpart
  • glmnet

New learners

  • Multinomial Logit (from mxnet?)
  • Cubist
  • fully connected neural networks (mxnet?) up to depth 3 or 4

Worthy Candidates (From Kaggle etc.)

Datasets

Parameter Spaces

See learners.R

Open Questions:

  • Draw a random task inside the bot or obtain it from outside?
  • Divide into big / small datasets and fast / slow learners?
  • Sample according to algo paramset dimensions?
  • Should e.g. xgboost's gbtree and gblinear be sampled with equal probability?
  • How do we do logging of failed jobs?
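
On the last question, one possible approach (a sketch only, not something the bot currently does) is to wrap each job in `tryCatch` and keep the error message next to the job id. The helper `run_logged()` and the two toy jobs below are hypothetical stand-ins for a real resampling call.

```r
# Run a job and record either its result or the error message.
# `job` is any zero-argument function; `run_logged` is a hypothetical helper.
run_logged = function(job, id) {
  tryCatch(
    list(id = id, ok = TRUE, result = job(), error = NA_character_),
    error = function(e) {
      list(id = id, ok = FALSE, result = NULL, error = conditionMessage(e))
    }
  )
}

# Toy jobs standing in for resampling runs: one succeeds, one fails.
jobs = list(
  good = function() 42,
  bad  = function() stop("learner crashed")
)

job_log = Map(run_logged, jobs, names(jobs))

# Keep only the failed entries for later inspection.
failed = Filter(function(entry) !entry$ok, job_log)
```

Since batchtools is already among the required packages, its own tracking of failed jobs may make a hand-written wrapper like this unnecessary.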

How do I run the bot?

The bot currently requires an OML task.id to run:

bot = OMLRandomBot$new(11)
bot$run()

Required packages

# Benchmark
library(mlr)
library(batchtools)
library(R6)
library(callr)
library(data.table)
library(ParamHelpers)

# Learners
library(rpart)
library(glmnet)
library(e1071)
library(ranger)
library(xgboost)
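
Before starting the bot, it may help to check that all of the packages above are installed. The following is a small sketch using base R's `requireNamespace`; the variable names are illustrative.

```r
# Packages listed above: benchmark infrastructure first, then learners.
required = c("mlr", "batchtools", "R6", "callr", "data.table", "ParamHelpers",
             "rpart", "glmnet", "e1071", "ranger", "xgboost")

# TRUE for every package that can be loaded, FALSE otherwise.
installed = vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs = required[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```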