MethodsOfMachineLearning/entropy-search

GP.K not positive definite


Hello!
I regularly get this warning:

 Warning: Inference method failed [Error using chol Matrix must be positive definite.] .. attempting to continue
 In gp (line 120)
   In SEGammaHyperPosterior (line 4)
   In EntropySearch>@(x)in.HyperPrior(x,GP.x,GP.y) (line 197)
   In minimize>f (line 200)
   In minimize>lineSearch (line 136)
   In minimize>BFGS (line 82)
   In minimize (line 56)
(...)

This does not lead to a crash by itself, but it seems to be connected to a more severe problem. After a larger number of iterations (in my case more than 80), I get the following errors, which do lead to a crash:

an error occured. evaluating function at random location 
Matrix must be positive definite.
(...)
Error using chol
Matrix must be positive definite.

Error in EntropySearch (line 273)
        GP.cK             = chol(GP.K);


Error in GPO (line 32)
result = EntropySearch(in);

The error and crash are reproducible.
My objective function yields costs in the range 0-300 and my parameter space is 10-dimensional. The covariance function and the hyperparameters are the same as in the ExampleSetup. Thank you for your help.

The most likely cause for this error is that the optimizer has repeatedly evaluated at almost the same location (in terms of kernel length scales), and thus the kernel Gram matrix has become singular to working precision.
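As a standalone illustration of that failure mode (plain MATLAB, not using the entropy-search code itself; kernel choice and variable names here are purely illustrative):

    % Two evaluation points that nearly coincide, relative to the length scale,
    % make the squared-exponential Gram matrix singular to working precision.
    ell = 10;                             % length scale, much larger than the point spacing
    X   = [0; 1e-8; 1];                   % first two points are effectively duplicates
    K   = exp(-(X - X').^2 / (2*ell^2));  % SE Gram matrix, unit signal variance
    fprintf('condition number of K: %g\n', cond(K));
    try
        cK = chol(K);                     % errors: "Matrix must be positive definite."
    catch err
        disp(err.message);
    end
    % A tiny diagonal "jitter" restores numerical positive definiteness,
    % but it only masks the underlying modelling problem described above.
    cK = chol(K + 1e-10*eye(numel(X)));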

This, in turn, is either because the optimization process has actually converged to the extremum, or because the hyperparameter optimization has become ill-posed and the length scales have been driven to very large values (check those optimized length scales, and the sequence of evaluation points, to see what's going on). The former case would be nice--you're just done. In the latter case, try changing the in.HyperPrior to enforce sane values (that is, ruling out large length scales).
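A hedged sketch of one way to do that, assuming (as the stack trace above suggests) that in.HyperPrior is a handle of the form f(hyp, X, Y) returning the negative log hyperposterior and its gradient for minimize, and that the leading entries of hyp are log length scales in the GPML convention; the wrapper name, penalty width, and cap below are placeholders, not part of the entropy-search API:

    % Hypothetical wrapper around an existing hyperposterior (e.g. SEGammaHyperPosterior)
    % that adds a quadratic penalty on log length scales above a chosen cap.
    function [nlp, dnlp] = boundedLengthScalePrior(hyp, X, Y, basePrior, D, maxLogEll)
        [nlp, dnlp] = basePrior(hyp, X, Y);   % original negative log posterior + gradient
        logEll = hyp(1:D);                    % log length scales (layout is an assumption)
        excess = max(logEll - maxLogEll, 0);  % only penalize length scales above the cap
        nlp        = nlp + 0.5 * sum(excess.^2) / 0.01;
        dnlp(1:D)  = dnlp(1:D) + excess / 0.01;
    end

used, for instance, as

    in.HyperPrior = @(hyp, X, Y) boundedLengthScalePrior(hyp, X, Y, ...
        @SEGammaHyperPosterior, 10, log(5));  % D = 10; cap is a placeholder value

where the cap log(5) should be replaced by something sensible for the actual parameter space.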

I understand that having to hunt around in the model like this is frustrating. Unfortunately, it's a fundamental feature of Bayesian optimization (all global optimization, really) that the model can never really be automated away without severely restricting functionality or lowering sample efficiency.

Indeed, the optimized length scales were becoming too large, so I will adapt the HyperPrior. Thank you.