alshedivat/keras-gp

Octave returned: error: Two few unique points.


Hi,

Thanks for your wonderful paper and code. I enjoyed reading the paper and running the code.
I have a question, though. When I tried msgp_mlp_kin40k, I got stuck on the Octave error "Two few unique points."

My DNN is quite large:

inputs = Input(shape=input_shape)
hidden = Dense(1000, activation='relu', name='dense1')(inputs)
hidden = Dropout(0.5)(hidden)
hidden = Dense(1000, activation='relu', name='dense2')(hidden)
hidden = Dropout(0.5)(hidden)
hidden = Dense(500, activation='relu', name='dense3')(hidden)
hidden = Dropout(0.5)(hidden)
hidden = Dense(50, activation='relu', name='dense4')(hidden)
hidden = Dropout(0.25)(hidden)
hidden = Dense(2, activation='relu', name='dense5')(hidden)
gp = GP(hyp={
            'lik': np.log(0.3),
            'mean': [],
            'cov': [[0.5], [1.0]],
        },
        inf='infGrid', dlik='dlikGrid',
        opt={'cg_maxit': 2000, 'cg_tol': 1e-6},
        mean='meanZero', cov='covSEiso',
        update_grid=1,
        grid_kwargs={'eq': 1, 'k': 70.},
        batch_size=batch_size,
        nb_train_samples=nb_train_samples)
outputs = [gp(hidden)]
return Model(inputs=inputs, outputs=outputs)

I only get this error with this reasonably large DNN. With a smaller DNN I don't encounter it at all.

Do you have any idea how to solve the problem?

Thanks a lot.


Training...
Function evaluation 0; Value 3.462581e+04
Epoch 1/500
8704/10000 [=========================>....] - ETA: 0s - loss: 1.0904 - gp_1_mse: 0.6684 - gp_1_nlml: 34598.8203 - mse: 0.6684 - nlml: 34598.8203
/usr/local/lib/python2.7/dist-packages/keras/callbacks.py:405: RuntimeWarning: Can save best model only with val_loss available, skipping.
  'skipping.' % (self.monitor), RuntimeWarning)
10000/10000 [==============================] - 1s - loss: 1.0836 - gp_1_mse: 0.6684 - gp_1_nlml: 34598.8203 - mse: 0.6684 - nlml: 34598.8203 - val_mse: 1.2968 - val_nlml: 34598.8218
Function evaluation 0; Value 5.264150e+04
Epoch 2/500
10000/10000 [==============================] - 0s - loss: 1.0001 - gp_1_mse: 0.9989 - gp_1_nlml: 52614.5039 - mse: 0.9989 - nlml: 52614.5039 - val_mse: 0.9924 - val_nlml: 52614.5037
Function evaluation 0; Value 5.264920e+04
Epoch 3/500
10000/10000 [==============================] - 0s - loss: 1.0000 - gp_1_mse: 0.9998 - gp_1_nlml: 52622.2109 - mse: 0.9998 - nlml: 52622.2109 - val_mse: 0.9922 - val_nlml: 52622.2115
Function evaluation 0; Value 5.259500e+04
Epoch 4/500
10000/10000 [==============================] - 0s - loss: 0.9999 - gp_1_mse: 0.9993 - gp_1_nlml: 52568.0078 - mse: 0.9993 - nlml: 52568.0078 - val_mse: 0.9923 - val_nlml: 52568.0072
Traceback (most recent call last):
  File "./examples/msgp_mlp_kin40k.py", line 133, in <module>
    main()
  File "./examples/msgp_mlp_kin40k.py", line 123, in main
    epochs=epochs, batch_size=batch_size, verbose=1)
  File "/home/hoangcuong2011/Desktop/kgp/kgp/utils/experiment.py", line 67, in train
    **fit_kwargs)
  File "/home/hoangcuong2011/Desktop/kgp/kgp/models.py", line 128, in fit
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1430, in fit
    initial_epoch=initial_epoch)
  File "/home/hoangcuong2011/Desktop/kgp/kgp/tweaks.py", line 98, in _fit_loop
    callbacks.on_epoch_begin(epoch)
  File "/usr/local/lib/python2.7/dist-packages/keras/callbacks.py", line 63, in on_epoch_begin
    callback.on_epoch_begin(epoch, logs)
  File "/home/hoangcuong2011/Desktop/kgp/kgp/callbacks.py", line 58, in on_epoch_begin
    gp.backend.update_grid('tr', verbose=self.verbose)
  File "/home/hoangcuong2011/Desktop/kgp/kgp/backend/gpml.py", line 157, in update_grid
    self.eng.eval(_gp_create_grid.format(**self.config), verbose=verbose)
  File "/home/hoangcuong2011/Desktop/kgp/kgp/backend/engines.py", line 99, in eval
    self._eng.eval(expr, verbose=verbose)
  File "/usr/local/lib/python2.7/dist-packages/oct2py/core.py", line 304, in eval
    out_file=self._reader.out_file)
  File "/usr/local/lib/python2.7/dist-packages/oct2py/core.py", line 625, in evaluate
    raise Oct2PyError(msg)
oct2py.utils.Oct2PyError: Oct2Py tried to run:
"""

xg = covGrid('create', X_tr, eq, k);

"""
Octave returned:
error: Two few unique points.
error: called from
apxGrid>creategrid at line 395 column 46
apxGrid at line 169 column 5
covGrid at line 8 column 44

Ah, I forgot to mention that I get the error only when I initialize DKL's parameters with the weights learned by a pre-trained DNN:

model.load_weights('checkpoints/mlp_kin40k.h5', by_name=True)

If I don't do that, training does not hit the error at all. But of course training the whole model from scratch is not a good idea, as DKL usually performs much worse without a smart initialization.
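
One possible explanation, which is only a hypothesis and not confirmed anywhere in this thread: with loaded weights, a unit in the final ReLU layer may receive non-positive pre-activations for every training point, so its output column is constant zero. The toy sketch below shows that mechanism in plain Python. All inputs, weights, and biases are made-up illustrative values, and `dense_relu` is a hypothetical helper, not part of Keras or kgp.

```python
def relu(x):
    """Rectified linear unit: pass positive values, clamp the rest to 0."""
    return x if x > 0.0 else 0.0


def dense_relu(rows, weights, biases):
    """Apply a dense layer followed by ReLU to every input row.

    weights[j] holds the input weights of output unit j.
    """
    return [
        [relu(sum(w * x for w, x in zip(w_j, row)) + b_j)
         for w_j, b_j in zip(weights, biases)]
        for row in rows
    ]


# Made-up non-negative activations coming out of a previous ReLU layer.
prev = [[0.2, 1.5, 0.0], [0.9, 0.3, 0.4], [0.0, 0.7, 1.1]]

# Made-up weights: unit 0 behaves normally; unit 1 has only non-positive
# weights and a negative bias, so its pre-activation is never positive.
weights = [[0.5, 0.2, 0.3], [-0.4, -0.1, -0.6]]
biases = [0.1, -0.05]

out = dense_relu(prev, weights, biases)

# A unit whose output takes a single value over all rows is "dead".
dead = [j for j in range(len(biases)) if len({row[j] for row in out}) == 1]
print(dead)  # [1]
```

If this is what happens here, the second GP input dimension would contain a single unique value, which would fit the error message, but I have not verified it on the actual model.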

Best,

Hi @hoangcuong2011
I have also run into this problem, and it seems to appear only when my layers have >=500 nodes. Did you ever figure out a workaround for this? Thanks in advance.

Hi,
I have forgotten almost everything, as it has been a year and a half since then.
But I recall that I found no workaround (even after extensive googling). As far as I remember, I had to give up on this.

But if you happen to come up with a solution, please let me know as well.

Many thx!

Thanks! I will let you know if I learn anything.