mlr-org/mlrMBO

Bug: small batch size with categorical variables

rajeeja opened this issue · 6 comments

The script linked below is a standalone reproduction of the error, provided so the bug can be filed against mlrMBO:

https://github.com/rajeeja/mlrmbo-bug/blob/master/mlrMBOMixedIntegerTest11a.R

Please let me know if you need more details.

Hi,
you are using the initial design in a weird way. It is simply too small for your big search space.

Why do you generate the design with max.budget points and then take only the first 5 (propose.points)?

Your initial design has to contain each discrete value at least once so that the surrogate can make predictions.

For me it works with design = generateDesign(n = 30, par.set = getParamSet(obj.fun))
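
To check up front whether a design covers every discrete value, something like the following base-R sketch could help. It is only an illustration: the helper missing_levels, the parameter names (kernel, opt, lr) and the toy design are all hypothetical and not taken from the linked script, and it deliberately avoids any mlrMBO or ParamHelpers calls.

```r
# Hypothetical helper (not part of mlrMBO): for each discrete parameter,
# return the allowed values that never occur in the design.
missing_levels = function(design, allowed) {
  out = lapply(names(allowed), function(p) {
    setdiff(allowed[[p]], unique(as.character(design[[p]])))
  })
  names(out) = names(allowed)
  # Keep only parameters that actually have uncovered values.
  out[lengths(out) > 0]
}

# Toy search space: two discrete parameters (names are made up).
allowed = list(
  kernel = c("linear", "rbf", "poly"),
  opt = c("sgd", "adam")
)

# Toy initial design with only 5 points; "poly" never appears.
design = data.frame(
  kernel = c("linear", "rbf", "linear", "rbf", "linear"),
  opt = c("sgd", "adam", "sgd", "sgd", "adam"),
  lr = c(0.1, 0.01, 0.5, 0.2, 0.05),
  stringsAsFactors = FALSE
)

uncovered = missing_levels(design, allowed)
print(uncovered)
```

If the returned list is non-empty, the design is missing at least one discrete value, and a larger n in generateDesign (or manually added rows) would be needed before fitting the surrogate.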

@jakob-r Thanks!
But "Your initial design has to contain each discrete value at least once so that the surrogate can make predictions." is not sufficient if I use the learner below:

surr.rf = makeLearner("regr.randomForest",
                      predict.type = "se",
                      fix.factors.prediction = TRUE,
                      se.method = "bootstrap",
                      se.boot = 2)

res = mbo(obj.fun, design = design, learner = surr.rf, control = ctrl, show.info = TRUE)

Complete isolated example is here
https://github.com/rajeeja/mlrmbo-bug/blob/master/learner-discrete-param-bug.R

True, my answer is kind of restricted to the surrogate. However, I have doubts that the surrogate will work so well, especially the uncertainty estimation for unknown factors. I am curious to see results of any optimization benchmark using this approach 🙂

Even if I increase the propose.points to 1000, I get the error:
Error in predict.randomForest(getLearnerModel(x), newdata = .newdata, :
New factor levels not present in the training data

for this example: https://github.com/rajeeja/mlrmbo-bug/blob/master/learner-discrete-param-bug.R

What should be a fix for getting something like this to work?

If I change

surr.rf = makeLearner("regr.randomForest",
                      predict.type = "se",
                      fix.factors.prediction = TRUE,
                      se.method = "bootstrap",
                      se.boot = 8)

to

surr.rf = makeLearner("regr.randomForest",
                      predict.type = "se",
                      fix.factors.prediction = TRUE)

it works. I'll update you about results from this approach. Also, an older version works even with se->

Just found that changing se.method = "bootstrap" to se.method = "jackknife" also works.
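
One possible reading of why only se.method = "bootstrap" fails (an assumption, not confirmed anywhere in this thread): if each bootstrap model is fit on a resample of the design, a discrete value that occurs only once can be absent from that resample, so the model has never seen it. The base-R sketch below only illustrates how often that happens in a toy design; the column name kernel and the data are made up, and nothing here calls mlr or randomForest.

```r
# Probability that a level observed exactly once among n rows is missing
# from a bootstrap resample of size n (sampling with replacement).
p_level_missing = function(n) (1 - 1 / n)^n

# Toy design: level "c" occurs only once (hypothetical data).
set.seed(1)
design = data.frame(kernel = factor(c("a", "b", "c", "a", "b")))

# Draw many bootstrap resamples and record whether "c" was left out.
n.rep = 2000
missed = replicate(n.rep, {
  idx = sample(nrow(design), replace = TRUE)
  !("c" %in% as.character(design$kernel[idx]))
})

mean(missed)         # empirical frequency of resamples without "c"
p_level_missing(5)   # analytic value: 0.8^5 = 0.32768
```

Under this assumption, a small se.boot combined with rarely sampled discrete values makes a missing level fairly likely, which would fit the reports above that a larger initial design or se.method = "jackknife" avoids the error.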