mlr-org/mlrMBO

regr.km becomes extremely slow after 800 runs

RPdavies opened this issue · 3 comments

For moderately high dimensions (~15), fitting the surrogate with "regr.km" and the default covtype = "matern3_2" can take much longer than evaluating the objective function itself after the first 800 or so iterations (in our case, over a hundred times slower).

Are there any guidelines for selecting an appropriate covtype and range hyperparameters to adjust the trade-off between speed and accuracy in cases like this?
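For reference, here is roughly how I'm setting things up (a minimal sketch; the toy objective and iteration count just stand in for our real, expensive function):

```r
library(mlrMBO)

# Toy 15-dimensional objective standing in for our real, expensive function
obj.fun = makeSingleObjectiveFunction(
  name = "toy-15d",
  fn = function(x) sum(x^2),
  par.set = makeNumericParamSet("x", len = 15L, lower = -5, upper = 5)
)

# Kriging surrogate; covtype is the kernel hyperparameter in question
lrn = makeLearner("regr.km", predict.type = "se", covtype = "matern3_2")

ctrl = makeMBOControl()
ctrl = setMBOControlTermination(ctrl, iters = 1000L)

res = mbo(obj.fun, learner = lrn, control = ctrl)
```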

Thanks in advance

No, there is no guideline, and I don't know of any tricks that would speed this up. We are thinking about connecting other, faster Kriging/GP implementations (e.g. gpytorch) in the future, but at the moment none of us has the time for that.

OK. Thanks, Jakob! One approach I am trying is multipoint proposal with a criterion such as the lower confidence bound ("cb"), which doesn't require refitting the surrogate for each proposal. That way we at least get several points per surrogate fit, though I'm not sure whether that's a false economy.
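In case it's useful, this is roughly the control setup I'm experimenting with (a sketch; propose.points and iters are arbitrary choices here):

```r
library(mlrMBO)

# Ask for several proposals per surrogate fit; with method = "cb" each proposed
# point uses a different lambda for the confidence bound, so the model is only
# fitted once per iteration.
ctrl = makeMBOControl(propose.points = 4L)
ctrl = setMBOControlInfill(ctrl, crit = makeMBOInfillCritCB())
ctrl = setMBOControlMultiPoint(ctrl, method = "cb")
ctrl = setMBOControlTermination(ctrl, iters = 200L)
```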

I'll keep an eye out for future Kriging implementations, thanks again!

You are always welcome to implement those as mlr regression learners so that they can be directly used from within mlrMBO 😉
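The skeleton for such a learner looks roughly like this (a sketch only: "regr.mygp" and the somegp package are placeholders for whatever backend you wrap):

```r
library(mlr)

# Constructor: "regr.mygp" and the somegp package are placeholders for
# whatever GP/Kriging backend gets wrapped.
makeRLearner.regr.mygp = function() {
  makeRLearnerRegr(
    cl = "regr.mygp",
    package = "somegp",
    par.set = makeParamSet(),
    properties = c("numerics", "se"),
    name = "Some faster GP",
    short.name = "mygp"
  )
}

# Fit the wrapped model on the task data.
trainLearner.regr.mygp = function(.learner, .task, .subset, .weights = NULL, ...) {
  d = getTaskData(.task, .subset, target.extra = TRUE)
  somegp::fit_gp(d$data, d$target, ...)  # placeholder fitting call
}

# Return predictions for new data (the predict.type == "se" branch that
# mlrMBO needs for CB/EI is omitted in this sketch).
predictLearner.regr.mygp = function(.learner, .model, .newdata, ...) {
  predict(.model$learner.model, newdata = .newdata)  # placeholder predict call
}
```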

Multipoint is a valid approach for this scenario. In the end it is always a difficult trade-off: you have to balance the budget spent on fitting the Kriging model against the budget spent on objective function evaluations. I would also watch the learning curve and the proposed x values to see whether further iterations still improve y or explore new areas.
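For example, something along these lines (a sketch, assuming `res` is the object returned by `mbo()` and you are minimizing):

```r
# Turn the archive of evaluations into a data.frame
path = as.data.frame(res$opt.path)

# Learning curve: does the best y still improve in later iterations?
plot(path$dob, cummin(path$y), type = "s",
     xlab = "iteration (dob)", ylab = "best y so far")

# Recently proposed x values: still exploring, or stuck in one region?
tail(path[order(path$dob), ], 10)
```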

I will close this issue since there is nothing to be done in mlrMBO (except perhaps writing a guideline, which is very hard and out of scope for pure package documentation). If you want to request a specific regression learner, you can open an issue/PR in mlr.