mlr-org/mlrMBO

document/develop more ways to control exploration-exploitation tradeoff

zkurtz opened this issue · 2 comments

Here are ways that I see mlrMBO currently offering control over exploration vs exploitation for single-objective tuning:

  • The infill criterion offers a discrete set of choices, each of which implies a particular tradeoff.
  • In particular, the cb.lambda parameter offers fairly direct control for the lower confidence bound criterion, as in equation (2).
  • setMBOControlInfill(..., interleave.random.points = ...) offers a way to inject some amount of pure exploration into any approach (see the sketch after this list).
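
For reference, a minimal sketch of wiring these existing controls together; it assumes the `makeMBOInfillCritCB` constructor and the `interleave.random.points` argument of `setMBOControlInfill`:

```r
library(mlrMBO)

# Sketch: combine the existing exploration controls.
ctrl = makeMBOControl()
ctrl = setMBOControlInfill(ctrl,
  # a larger cb.lambda weights the uncertainty term more -> more exploration
  crit = makeMBOInfillCritCB(cb.lambda = 2),
  # additionally interleave one purely random proposal per iteration
  interleave.random.points = 1L)
```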

What other controls exist? Here are some I'd like:

  1. Extend the definition of makeMBOInfillCritEI to accept the cb.lambda parameter too, as a coefficient on the variance term (why not?); a hypothetical sketch follows this list.
  2. Offer control over the Gaussian process prior of the learner, to allow setting a high prior variance.
  3. Offer control over the bandwidth of the Gaussian process covariance kernel, to be more or less permissive of wiggly loss surfaces.
  4. When the learner is a random forest, offer controls analogous to (2) and (3).
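
To make (1) concrete, here is a hypothetical sketch of such a scaled EI, written as a custom criterion via `makeMBOInfillCrit`. The `fun` signature, the use of `control$y.name`, and the restriction to minimization problems all follow my reading of the mlrMBO internals, so treat the details as unverified:

```r
library(mlrMBO)

# Hypothetical: EI with a lambda coefficient on the exploration term,
#   scaledEI(x) = d * pnorm(d / s) + lambda * s * dnorm(d / s),
# where d = y.min - mu(x) and s = se(x); lambda = 1 recovers standard EI.
# Assumes a minimization problem.
makeMBOInfillCritScaledEI = function(lambda = 1) {
  makeMBOInfillCrit(
    fun = function(points, models, control, par.set, designs,
                   iter, progress, attributes = FALSE) {
      model = models[[1L]]
      design = designs[[1L]]
      p = predict(model, newdata = points)$data
      mu = p$response
      s = p$se
      y.min = min(design[[control$y.name]])  # current best observed value
      d = y.min - mu
      xcr = d / s
      ei = d * pnorm(xcr) + lambda * s * dnorm(xcr)
      -ei  # mlrMBO minimizes infill criteria, so return negative EI
    },
    name = "Scaled expected improvement",
    id = "scaled.ei",
    requires.se = TRUE
  )
}
```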

Hi @zkurtz,
thanks for your input. I'd like to add:

  • Recently I added the possibility to implement adaptive infill criteria, i.e. criteria that adapt to the progress of the optimization. This is an experimental feature that allows you to set certain parameters depending on how far the optimization has progressed. A concrete example is the adaptive CB: one paper suggested that it can be beneficial to start with a large value for lambda and move to a smaller one. This is possible with this feature (see the sketch below).
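
A minimal sketch of how this could look, assuming the experimental `makeMBOInfillCritAdaCB` constructor and its `cb.lambda.start`/`cb.lambda.end` arguments:

```r
library(mlrMBO)

# Sketch: adaptive CB that starts exploratory (large lambda) and becomes
# exploitative (small lambda) as the optimization progresses.
ctrl = makeMBOControl()
ctrl = setMBOControlTermination(ctrl, iters = 20L)  # progress needs a budget
ctrl = setMBOControlInfill(ctrl,
  crit = makeMBOInfillCritAdaCB(cb.lambda.start = 5, cb.lambda.end = 0.1))
```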

Regarding your suggestions:

  1. Do you have any reference that says this is a good idea? I stumbled upon the epsilon value here but I have not found the reference yet.

  2. Do you mean the nugget setting? You can already set that when you define the learner manually: `lrn = makeLearner("regr.km", nugget = 0.5)`.
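
A slightly fuller sketch, assuming that the objective function and the control object are defined elsewhere:

```r
library(mlr)

# Sketch: set the nugget when constructing the surrogate manually.
# predict.type = "se" is required because the infill criteria need
# uncertainty estimates.
lrn = makeLearner("regr.km", predict.type = "se", nugget = 0.5)
# then hand the learner to the optimizer:
# res = mbo(fun, learner = lrn, control = ctrl)
```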

  3. You can also configure the kernel directly via the learner in mlr (see above).
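
For example (again a sketch, assuming the `covtype` values that `DiceKriging::km` exposes through mlr's `regr.km` wrapper):

```r
library(mlr)

# Sketch: a rougher kernel family is more permissive of wiggly loss
# surfaces than the default "matern5_2".
lrn = makeLearner("regr.km", predict.type = "se",
  covtype = "matern3_2",
  nugget = 0.5)
```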

  4. Again, these should all be learner settings.
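
For the random-forest surrogate, a sketch along the same lines; `se.method` and the pass-through of `randomForest` parameters are assumptions about mlr's `regr.randomForest` wrapper:

```r
library(mlr)

# Sketch: smoothness and uncertainty controls for the RF surrogate.
lrn = makeLearner("regr.randomForest", predict.type = "se",
  ntree = 500L,            # more trees -> smoother mean prediction
  nodesize = 5L,           # larger terminal nodes -> less wiggly fit
  se.method = "jackknife") # how the uncertainty estimate is computed
```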

+1 for the adaptive CB feature.

(1) I don't have a reference.
(2) Yes, nugget looks like the thing to start with.

More generally, regarding (2)-(4), I'm not surprised to hear that these are learner settings. Adding a vignette that highlights how to use these settings to influence the exploration-exploitation tradeoff for the two default learners would be going above and beyond, but I imagine it would be very useful.