Bayesian optimization gives flat latent function?
iRove108 opened this issue · 10 comments
I'm learning a noisy function. I've been getting a flat latent function, and I'm not sure why.
For a self-contained working example, I'm abstracting away a run from my actual application. That is, I'm providing precomputed values for my initial design and my objective function.
Define my function and my initial design:
import numpy as np
import matplotlib.pyplot as plt
from emukit.core import ParameterSpace, ContinuousParameter
from emukit.core.loop import FixedIterationsStoppingCondition, UserFunctionWrapper
from emukit.bayesian_optimization.acquisitions import ExpectedImprovement
from emukit.bayesian_optimization.loops import BayesianOptimizationLoop
from emukit.model_wrappers import GPyModelWrapper
import GPy
from GPy.models import GPRegression
# Define my function, which loops through 30 values in order
# (values precomputed from actually running BayesOpt on my real function)
class SimFunction:
    def __init__(self):
        self.times_called = 0
        self.arr = np.array([
            151.41014033, 62.09023478, 62.52426835, 61.8066145, 153.27467322,
            61.47440947, 61.66229474, 62.23729366, 62.66330217, 61.37228135,
            63.71660287, 60.80217923, 61.55372866, 61.85417587, 61.29182696,
            62.01017402, 63.35483382, 60.78094205, 62.70740196, 63.44405674,
            62.34055014, 61.26645109, 61.18795096, 62.27117287, 62.60566014,
            60.67162767, 61.84573099, 62.86046971, 61.38240327, 62.15477646])

    def __call__(self, x):
        value = self.arr[self.times_called]
        self.times_called += 1
        if self.times_called >= len(self.arr):
            self.times_called = 0
        return value
sim_function = SimFunction()
f = np.frompyfunc(sim_function, 1, 1)
parameter_space = ParameterSpace([ContinuousParameter('x1', 5, 100)])
# Define my initial design
# (values precomputed from actually running BayesOpt on my real function)
X = np.array([59.33766262, 66.87705975, 9.99274413, 91.77151485,
15.98334223, 75.47932193, 59.4018363 , 28.62138164,
73.61994809, 18.89241105])[:,None]
Y = np.array([41.462845288004075, 45.1335386921359, 40.595514445980804,
55.882947142107895, 19.57217258906048, 49.775813658233105,
41.639729759946434, 25.650601902591085, 48.58427959331652,
20.838451230479237])[:,None]
# Train GP model on initial evaluations
model_gpy = GPRegression(X, Y)
model_gpy.optimize()
model = GPyModelWrapper(model_gpy)
model_gpy.plot()
plt.show()
GP regression on initial points:
Run the Bayesian optimization loop and display the final latent function:
ITER = 30
# initialize bayesian optimization loop
expected_improvement = ExpectedImprovement(model=model)
loop = BayesianOptimizationLoop(model=model,
space=parameter_space,
acquisition=expected_improvement,
batch_size=1)
# run bayesian optimization
loop.run_loop(UserFunctionWrapper(f), FixedIterationsStoppingCondition(ITER))
model_gpy.plot()
plt.show()
The latent function is completely flat! Its hyperparameters also seem out of whack:
Name : GP regression
Objective : 187.15753229070074
Number of Parameters : 3
Number of Optimization Parameters : 3
Updates : True
Parameters:
GP_regression. | value | constraints | priors
rbf.variance | 3769.474111833743 | +ve |
rbf.lengthscale | 2762.2467570215567 | +ve |
Gaussian_noise.variance | 590.833747684825 | +ve |
However, if I fit another GP regression model on the same data points as in the above graph, I get a different function that also seems more reasonable:
# GP regression on experimental points, run separately
model_gpy2 = GPRegression(loop.loop_state.X, loop.loop_state.Y)
model_gpy2.optimize()
model_gpy2.plot()
plt.show()
Name : GP regression
Objective : 95.81686749520499
Number of Parameters : 3
Number of Optimization Parameters : 3
Updates : True
Parameters:
GP_regression. | value | constraints | priors
rbf.variance | 4170.330107829418 | +ve |
rbf.lengthscale | 5.611544854612646 | +ve |
Gaussian_noise.variance | 0.6604288598827055 | +ve |
I believe the two graphs above should be identical, so I suspect the model is not being fit correctly inside the optimization loop. I wonder if this has to do with the RuntimeWarning: overflow encountered in expm1 raised from paramz/transformations.py:111.
I'd greatly appreciate any ideas on this. Thanks!
Thanks for the report! No idea what's going on, but I've seen similar behavior recently, and the same warning too. Some googling suggests this might be related to Emukit generating some very big numbers somewhere: https://stackoverflow.com/questions/40726490/overflow-error-in-pythons-numpy-exp-function
Would you be able to get a full stack trace of this warning?
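In case it helps, one way to surface the full traceback (just a sketch, assuming the warning goes through numpy's usual floating point warning machinery) is to promote the warning to an error before running the loop:

import warnings
warnings.filterwarnings("error", category=RuntimeWarning)
# or, to raise on the overflow specifically:
# np.seterr(over="raise")

Re-running the snippet should then stop at the offending call and print the full stack.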
For sure! I was able to get a trace of it below in Colab. It seems to be an issue with GPy and paramz during optimize_restarts; the final warning is thrown here.
This thread looks related. Despite the claim there that it is "just a warning," I suspect it might still be causing problems here (particularly given the large parameter values reported), but I'm not positive.
File "<ipython-input-4-46631344dbd1>", line 11, in <module>
loop.run_loop(UserFunctionWrapper(f), FixedIterationsStoppingCondition(ITER))
File "/usr/local/lib/python3.7/dist-packages/emukit/core/loop/outer_loop.py", line 91, in run_loop
self._update_models()
File "/usr/local/lib/python3.7/dist-packages/emukit/core/loop/outer_loop.py", line 104, in _update_models
model_updater.update(self.loop_state)
File "/usr/local/lib/python3.7/dist-packages/emukit/core/loop/model_updaters.py", line 62, in update
self.model.optimize()
File "/usr/local/lib/python3.7/dist-packages/emukit/model_wrappers/gpy_model_wrappers.py", line 86, in optimize
self.model.optimize_restarts(self.n_restarts, verbose=verbose, robust=True)
File "/usr/local/lib/python3.7/dist-packages/paramz/model.py", line 152, in optimize_restarts
initial_parameters = self.optimizer_array.copy()
File "/usr/local/lib/python3.7/dist-packages/paramz/core/parameter_core.py", line 87, in optimizer_array
[np.put(self._optimizer_copy_, ind, c.finv(self.param_array[ind])) for c, ind in self.constraints.items() if c != __fixed__]
File "/usr/local/lib/python3.7/dist-packages/paramz/core/parameter_core.py", line 87, in <listcomp>
[np.put(self._optimizer_copy_, ind, c.finv(self.param_array[ind])) for c, ind in self.constraints.items() if c != __fixed__]
File "/usr/local/lib/python3.7/dist-packages/paramz/transformations.py", line 111, in finv
return np.where(f>_lim_val, f, np.log(np.expm1(f)))
/usr/local/lib/python3.7/dist-packages/paramz/transformations.py:111: RuntimeWarning: overflow encountered in expm1
Thanks! What happens if in your second snippet the GPy model is wrapped in the Emukit wrapper? I suspect the issue might be that we don't just call optimize, and instead call optimize_restarts(self.n_restarts, verbose=verbose, robust=True).
At first I was thinking that too. I tried replacing the second snippet with:
# GP regression on experimental points, run separately
model_gpy2 = GPRegression(loop.loop_state.X, loop.loop_state.Y)
model_gpy2_emukit = GPyModelWrapper(model_gpy2)
model_gpy2_emukit.optimize()
model_gpy2.plot()
plt.show()
And it worked the same (fit the function properly).
I also tried creating a new GPyModelWrapper class that calls optimize instead, but the warning was still thrown with the following traceback (a sketch of the change is shown after the traceback):
File "/var/folders/xl/nxhg5qwd3f7b8yjnc80d334h0000gn/T/ipykernel_50332/4090956357.py", line 15, in <module>
loop.run_loop(UserFunctionWrapper(f), FixedIterationsStoppingCondition(MAX_ITER))
File "python3.9/site-packages/emukit/core/loop/outer_loop.py", line 91, in run_loop
self._update_models()
File "python3.9/site-packages/emukit/core/loop/outer_loop.py", line 104, in _update_models
model_updater.update(self.loop_state)
File "python3.9/site-packages/emukit/core/loop/model_updaters.py", line 62, in update
self.model.optimize()
File "gpy_model_wrapper_norestart.py", line 93, in optimize
self.model.optimize()
File "python3.9/site-packages/GPy/core/gp.py", line 675, in optimize
ret = super(GP, self).optimize(optimizer, start, messages, max_iters, ipython_notebook, clear_after_finish, **kwargs)
File "python3.9/site-packages/paramz/model.py", line 98, in optimize
start = self.optimizer_array
File "python3.9/site-packages/paramz/core/parameter_core.py", line 87, in optimizer_array
[np.put(self._optimizer_copy_, ind, c.finv(self.param_array[ind])) for c, ind in self.constraints.items() if c != __fixed__]
File "python3.9/site-packages/paramz/core/parameter_core.py", line 87, in <listcomp>
[np.put(self._optimizer_copy_, ind, c.finv(self.param_array[ind])) for c, ind in self.constraints.items() if c != __fixed__]
File "python3.9/site-packages/paramz/transformations.py", line 111, in finv
return np.where(f>_lim_val, f, np.log(np.expm1(f)))
python3.9/site-packages/paramz/transformations.py:111: RuntimeWarning: overflow encountered in expm1
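For reference, the change in that wrapper was essentially the following (a minimal sketch; the subclass name is just illustrative):

from emukit.model_wrappers import GPyModelWrapper

class GPyModelWrapperNoRestart(GPyModelWrapper):
    # Identical to GPyModelWrapper, except optimize() calls plain GPy
    # optimize() rather than optimize_restarts()
    def optimize(self, verbose=False):
        self.model.optimize(messages=verbose)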
My guess is that it isn't the model wrapper that's malfunctioning, but rather BayesianOptimizationLoop.
Thanks for all this effort. I'll try to carve out time to debug this properly and update here.
Ok, I think this is just a modelling issue, and the fact that the error comes from paramz hints at that. When you create the model, it's just a plain GPRegression from GPy with all default parameters. That includes the default values for the lengthscale, kernel variance and noise variance, which I believe are all set to 1 in GPy. Usually this isn't a problem, but in your case those defaults are completely out of scale with your actual inputs and outputs. When I play with different combinations of values for these, I get very different fits.
The simplest thing to try would be to normalize your data to [0, 1]. Setting priors might also be worth it, but it's much harder to pick good ones, and it probably won't be necessary.
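As a rough illustration (just a sketch building on your first snippet; the numbers here are made up), you could either rescale the data before fitting or start the hyperparameters at values that match the scale of your inputs and outputs:

# Option 1: rescale inputs/outputs to a comparable range before fitting
X_scaled = (X - X.min()) / (X.max() - X.min())   # inputs into [0, 1]
Y_scaled = (Y - Y.mean()) / Y.std()              # outputs to roughly unit scale

# Option 2: start the hyperparameters at scale-appropriate values
kernel = GPy.kern.RBF(input_dim=1, variance=Y.var(), lengthscale=10.0)
model_gpy = GPRegression(X, Y, kernel)
model_gpy.Gaussian_noise.variance = Y.var() * 0.1
model_gpy.optimize()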
Hi, thank you so much for that, I really appreciate the help! I tried passing normalizer=Standardize() to GPRegression and it worked much better. It's no longer giving any warnings either.
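Concretely, the only change was in the model construction, something like this (assuming GPy.util.normalizer.Standardize, which is what I used):

from GPy.util.normalizer import Standardize

model_gpy = GPRegression(X, Y, normalizer=Standardize())
model_gpy.optimize()
model = GPyModelWrapper(model_gpy)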
I was still wondering, though: should I expect the graph of the latent function after BayesianOptimizationLoop.run_loop to match the one from fitting a GP separately on (loop.loop_state.X, loop.loop_state.Y), as in my original question?
For example, after re-running the process with some more data, the graph displayed following BayesianOptimizationLoop.run_loop looks like:
While the graph displayed after optimizing a new GPRegression model (still with normalizer=Standardize()) looks like:
No, generally they don't have to be the same. Re-training from scratch often yields different parameter values compared to iteratively adding points one by one.
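A quick way to see this (just a sketch): fit two models on exactly the same data and compare the hyperparameters they end up with.

# Both models see the same data, but can land in different local optima of
# the marginal likelihood depending on where the optimizer starts.
model_a = GPRegression(loop.loop_state.X, loop.loop_state.Y, normalizer=Standardize())
model_a.optimize()                            # single fit from default hyperparameters

model_b = GPRegression(loop.loop_state.X, loop.loop_state.Y, normalizer=Standardize())
model_b.optimize_restarts(10, robust=True)    # restarts from randomized initial values

print(model_a.rbf.lengthscale, model_b.rbf.lengthscale)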
Ah ok, thanks for the clarification! I assume differences could come from randomization while optimizing parameters.
Indeed