facebook/Ax

How to use custom Gaussian prior for Bayesian optimization?

davidhaslacher opened this issue · 5 comments

In some cases, it would be advantageous to use domain knowledge in the form of a Gaussian process prior for Bayesian optimization. If I know that some parameter values are likely to result in better outcomes, it would help to encode this knowledge in a prior distribution over functions. How can I do this in Ax or BoTorch?

What form would the prior take? Would it itself be a GP prior (e.g. in the context of transfer learning or multi-task learning)? Or is it a user-specified functional form? There are potentially different ways of doing this; a simple one would be to swap out the prior mean function of the GP model. There are multiple papers that do this. Are you thinking of a particular reference?

Except for the transfer learning type setup where the prior is coming from some other GP model, these customizations are currently not straightforwardly supported in Ax, and it would be easier to do this through BoTorch.

Thanks for getting back to me so quickly.

I will have hundreds of subjects, for each of which I will perform vanilla Bayesian optimization for N < 30 trials.

I would then like to use the information contained in this database of trained models as a prior for any future optimization problems. The Mean Hierarchical GP (MHGP) would probably be suitable, due to complexity considerations. The MHGP uses a hierarchical kernel, to the effect that the posterior mean of the trained models would be used as a prior mean for any new optimization problem. Would something like this be possible in Ax?
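To make the idea concrete, here is a small NumPy sketch of the two-level case: the posterior mean of a source GP serves as the prior mean of the target GP, which amounts to fitting the target GP to the residuals. This is a simplification under stated assumptions, not the MHGP as published: there is a single source model, the RBF kernel hyperparameters are fixed rather than fitted, and the toy functions, design points, and helper names (rbf, gp_mean, mhgp_mean) are hypothetical.

```python
import numpy as np


def rbf(a, b, lengthscale=0.2, variance=1.0):
    """RBF kernel matrix between row-wise point sets a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_mean(x_train, y_train, x_query, noise=1e-2):
    """Posterior mean of a zero-prior-mean GP with fixed hyperparameters."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    return rbf(x_query, x_train) @ np.linalg.solve(K, y_train)


def mhgp_mean(x_src, y_src, x_tgt, y_tgt, x_query):
    """Target posterior mean when the source posterior mean is the prior mean."""
    # The target GP only has to explain what the source model gets wrong.
    residuals = y_tgt - gp_mean(x_src, y_src, x_tgt)
    return gp_mean(x_src, y_src, x_query) + gp_mean(x_tgt, residuals, x_query)


def toy_objective(x, shift):
    # Hypothetical family of related tasks that differ by a small shift.
    return np.sin(6.0 * (x[:, 0] - shift))


# Many observations from an earlier subject, few from the new one.
x_src = np.linspace(0.0, 1.0, 30)[:, None]
y_src = toy_objective(x_src, shift=0.0)
x_tgt = np.array([[0.1], [0.5], [0.9]])
y_tgt = toy_objective(x_tgt, shift=0.05)

x_grid = np.linspace(0.0, 1.0, 50)[:, None]
truth = toy_objective(x_grid, shift=0.05)
err_transfer = np.abs(mhgp_mean(x_src, y_src, x_tgt, y_tgt, x_grid) - truth).mean()
err_scratch = np.abs(gp_mean(x_tgt, y_tgt, x_grid) - truth).mean()
print(f"transfer: {err_transfer:.3f}, from scratch: {err_scratch:.3f}")
```

With hundreds of subjects, the same construction would have to be stacked or aggregated across the source models, and the sketch says nothing about how to do that efficiently.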

Are you aware of any examples/tutorials on transfer learning in Ax or BoTorch? I only found the one on the Rank-Weighted GP Ensemble.

Yeah the MHGP will be straightforward to implement in BoTorch. We have also been looking into transfer learning approaches more recently and have tried out some other models as well.

As to hooking these things into Ax: It's definitely possible; the main challenge is passing all the metadata and information through the whole stack. We're thinking about how to expose things in Ax in a way that's reasonably concise yet customizable, which is not easy.

One simple option would be to use the Modular BoTorch Model interface (https://github.com/facebook/Ax/blob/main/tutorials/modular_botax.ipynb) to pass in a new custom BoTorch class together with pre-fitted models. There are some pitfalls here, though: the data the model receives in the Ax stack typically lives in a transformed space (normalized and standardized), so one would have to take that into account.
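To illustrate the transformed-space pitfall, here is a small sketch of how a prior mean written in raw units could be re-expressed in the space the model sees. It assumes parameters are min-max normalized to the unit cube and outcomes are standardized with a known mean and standard deviation; the actual transforms applied in a given Ax setup may differ, and the helper to_model_space_mean, the bounds, and the numbers are all hypothetical.

```python
import numpy as np


def to_model_space_mean(raw_mean, lower, upper, y_mean, y_std):
    """Wrap a raw-unit prior mean so it accepts unit-cube inputs and
    returns standardized outputs (assumed transforms, see lead-in)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    def model_mean(x_unit):
        # Undo the input normalization before evaluating the raw prior.
        x_raw = lower + np.asarray(x_unit, dtype=float) * (upper - lower)
        # Apply the same standardization the observed outcomes received.
        return (raw_mean(x_raw) - y_mean) / y_std

    return model_mean


def raw_prior_mean(x_raw):
    # Hypothetical belief in raw units: outcome peaks at 100 when x = 20.
    return 100.0 - (x_raw[..., 0] - 20.0) ** 2


y_obs = np.array([90.0, 64.0, 100.0])  # toy raw outcomes
y_mean, y_std = y_obs.mean(), y_obs.std(ddof=1)

model_mean = to_model_space_mean(
    raw_prior_mean, lower=[10.0], upper=[30.0], y_mean=y_mean, y_std=y_std
)
value = model_mean(np.array([[0.5]]))  # 0.5 in the unit cube is x = 20 raw
print(value)
```

One caveat with this approach is that the standardization statistics change as new trials arrive, so a wrapped prior mean would need to be rebuilt whenever the model is refit.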

cc @qingfeng1, @saitcakmak, and @dme65 - any thoughts on when some of the things you've been looking into may end up in OSS (botorch or Ax)?

Some related issues:

... e.g. a simple one would be to swap out the prior mean function of the GP model. ...(#1647 (comment))

I think https://github.com/ziatdinovmax/gpax is along these lines. This type of approach seems to be increasingly popular in the physical sciences (e.g., at a recent UQ Workshop, cc @andrewgordonwilson)

Putting this on our wishlist for now.