secondmind-labs/trieste

Keras model cannot be saved with TensorBoard callback

Closed this issue · 0 comments

If we add the following standard callback:

tf.keras.callbacks.TensorBoard(
    log_dir=".",
    histogram_freq=0,
    update_freq="epoch",
    write_graph=False,
    write_images=False,
    embeddings_freq=0,
)

to fit_args, the optimization state cannot be saved and the following error occurs:

E                       NotImplementedError: Failed to save the optimization state. Some models do not support deecopying or serialization and cannot be saved. (This is particularly common for deep neural network models, though some of the model wrappers accept a model closure as a workaround.) For these models, the `track_state`` argument of the :meth:`~trieste.bayesian_optimizer.BayesianOptimizer.optimize` method should be set to `False`. This means that only the final model will be available.

../../trieste/bayesian_optimizer.py:668: NotImplementedError
------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------
2022-07-31 22:37:26.456257: I tensorflow/core/profiler/lib/profiler_session.cc:126] Profiler session initializing.
2022-07-31 22:37:26.456310: I tensorflow/core/profiler/lib/profiler_session.cc:141] Profiler session started.
2022-07-31 22:37:26.456357: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session tear down.
2022-07-31 22:37:26.457966: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2022-07-31 22:37:26.554112: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2022-07-31 22:37:26.554592: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 1999965000 Hz
2022-07-31 22:37:27.802835: I tensorflow/core/profiler/lib/profiler_session.cc:126] Profiler session initializing.
2022-07-31 22:37:27.802877: I tensorflow/core/profiler/lib/profiler_session.cc:141] Profiler session started.
2022-07-31 22:37:27.805983: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
2022-07-31 22:37:27.808525: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session tear down.
2022-07-31 22:37:27.812080: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: ./train/plugins/profile/2022_07_31_22_37_27
2022-07-31 22:37:27.814008: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for trace.json.gz to ./train/plugins/profile/2022_07_31_22_37_27/hrvojes-lptp.trace.json.gz
2022-07-31 22:37:27.818018: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: ./train/plugins/profile/2022_07_31_22_37_27
2022-07-31 22:37:27.818067: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for memory_profile.json.gz to ./train/plugins/profile/2022_07_31_22_37_27/hrvojes-lptp.memory_profile.json.gz
2022-07-31 22:37:27.818316: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: ./train/plugins/profile/2022_07_31_22_37_27Dumped tool data for xplane.pb to ./train/plugins/profile/2022_07_31_22_37_27/hrvojes-lptp.xplane.pb
Dumped tool data for overview_page.pb to ./train/plugins/profile/2022_07_31_22_37_27/hrvojes-lptp.overview_page.pb
Dumped tool data for input_pipeline.pb to ./train/plugins/profile/2022_07_31_22_37_27/hrvojes-lptp.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to ./train/plugins/profile/2022_07_31_22_37_27/hrvojes-lptp.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to ./train/plugins/profile/2022_07_31_22_37_27/hrvojes-lptp.kernel_stats.pb

2022-07-31 22:37:30.901303: I tensorflow/core/profiler/lib/profiler_session.cc:126] Profiler session initializing.
2022-07-31 22:37:30.901355: I tensorflow/core/profiler/lib/profiler_session.cc:141] Profiler session started.
2022-07-31 22:37:31.141108: I tensorflow/core/profiler/lib/profiler_session.cc:66] Profiler session collecting data.
2022-07-31 22:37:31.146522: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session tear down.
2022-07-31 22:37:31.153055: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: ./train/plugins/profile/2022_07_31_22_37_31
2022-07-31 22:37:31.156888: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for trace.json.gz to ./train/plugins/profile/2022_07_31_22_37_31/hrvojes-lptp.trace.json.gz
2022-07-31 22:37:31.164615: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: ./train/plugins/profile/2022_07_31_22_37_31
2022-07-31 22:37:31.164730: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for memory_profile.json.gz to ./train/plugins/profile/2022_07_31_22_37_31/hrvojes-lptp.memory_profile.json.gz
2022-07-31 22:37:31.165329: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: ./train/plugins/profile/2022_07_31_22_37_31Dumped tool data for xplane.pb to ./train/plugins/profile/2022_07_31_22_37_31/hrvojes-lptp.xplane.pb
Dumped tool data for overview_page.pb to ./train/plugins/profile/2022_07_31_22_37_31/hrvojes-lptp.overview_page.pb
Dumped tool data for input_pipeline.pb to ./train/plugins/profile/2022_07_31_22_37_31/hrvojes-lptp.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to ./train/plugins/profile/2022_07_31_22_37_31/hrvojes-lptp.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to ./train/plugins/profile/2022_07_31_22_37_31/hrvojes-lptp.kernel_stats.pb

2022-07-31 22:37:31.718234: E tensorflow/core/kernels/logging_ops.cc:174] 
Optimization failed at step 1, encountered error with traceback:
Traceback (most recent call last):
  File "/home/hrvoje.stojic/code/trieste/trieste/bayesian_optimizer.py", line 657, in optimize
    history.append(record.save(track_path / file_name))
  File "/home/hrvoje.stojic/code/trieste/trieste/bayesian_optimizer.py", line 114, in save
    dill.dump(self, f, dill.HIGHEST_PROTOCOL)
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 276, in dump
    Pickler(file, protocol, **_kwds).dump(obj)
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 498, in dump
    StockPickler.dump(self, obj)
  File "/usr/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/usr/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.7/pickle.py", line 890, in _batch_setitems
    save(v)
  File "/usr/lib/python3.7/pickle.py", line 524, in save
    rv = reduce(self.proto)
  File "/home/hrvoje.stojic/code/trieste/trieste/models/keras/models.py", line 407, in __getstate__
    state["_optimizer"] = dill.dumps(state["_optimizer"])
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 304, in dumps
    dump(obj, file, protocol, byref, fmode, recurse, **kwds)#, strictio)
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 276, in dump
    Pickler(file, protocol, **_kwds).dump(obj)
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 498, in dump
    StockPickler.dump(self, obj)
  File "/usr/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/usr/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.7/pickle.py", line 819, in save_list
    self._batch_appends(obj)
  File "/usr/lib/python3.7/pickle.py", line 843, in _batch_appends
    save(x)
  File "/usr/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.7/pickle.py", line 890, in _batch_setitems
    save(v)
  File "/usr/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/lib/python3.7/pickle.py", line 524, in save
    rv = reduce(self.proto)
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1015, in __reduce__
    return convert_to_tensor, (self._numpy(),)
  File "/home/hrvoje.stojic/.virtualenvs/trieste/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1062, in _numpy
    six.raise_from(core._status_to_exception(e.code, e.message), None)  # pylint: disable=protected-access
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot convert a Tensor of dtype resource to a NumPy array.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hrvoje.stojic/code/trieste/trieste/bayesian_optimizer.py", line 668, in optimize
    ) from e
NotImplementedError: Failed to save the optimization state. Some models do not support deecopying or serialization and cannot be saved. (This is particularly common for deep neural network models, though some of the model wrappers accept a model closure as a workaround.) For these models, the `track_state`` argument of the :meth:`~trieste.bayesian_optimizer.BayesianOptimizer.optimize` method should be set to `False`. This means that only the final model will be available.

Terminating optimization and returning the optimization history. You may be able to use the history to restart the process from a previous successful optimization step.
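
The traceback suggests that the failure comes from an object reachable from the callback list holding a resource handle that cannot be copied or serialized. The following is a minimal, standard-library-only sketch of that mechanism and of one possible mitigation (leaving the callbacks out of the saved state). It is not Trieste's code and not necessarily how the issue was resolved: ResourceHoldingCallback, NaiveWrapper, CallbackStrippingWrapper and can_copy are hypothetical names, and a threading.Lock stands in for the resource tensor. copy.deepcopy is used because it goes through the same reduction protocol as pickle.

```python
import copy
import threading


class ResourceHoldingCallback:
    """Hypothetical stand-in for a callback that owns a live, uncopyable handle."""

    def __init__(self, log_dir: str) -> None:
        self.log_dir = log_dir
        # A lock cannot be pickled or deep-copied; it plays the role of the
        # resource tensor from the traceback (assumption for illustration).
        self._writer = threading.Lock()


class NaiveWrapper:
    """Hypothetical model wrapper that copies every attribute, callbacks included."""

    def __init__(self, fit_args: dict) -> None:
        self.fit_args = fit_args


class CallbackStrippingWrapper(NaiveWrapper):
    """Hypothetical wrapper that leaves the callbacks out of its saved state."""

    def __getstate__(self) -> dict:
        state = self.__dict__.copy()
        # Build a new fit_args dict so the live object keeps its callbacks.
        state["fit_args"] = {**state["fit_args"], "callbacks": []}
        return state


def can_copy(obj: object) -> bool:
    """Return True if obj survives copy.deepcopy, False if copying raises."""
    try:
        copy.deepcopy(obj)
        return True
    except Exception:
        return False


fit_args = {"epochs": 10, "callbacks": [ResourceHoldingCallback(".")]}
naive = NaiveWrapper(fit_args)
stripping = CallbackStrippingWrapper(fit_args)

naive_ok = can_copy(naive)          # the lock inside the callback blocks the copy
stripping_ok = can_copy(stripping)  # callbacks are excluded from the copied state
restored = copy.deepcopy(stripping)

print(naive_ok, stripping_ok, restored.fit_args)
```

With this approach the restored copy has an empty callback list, so the callbacks would have to be re-attached after loading. The alternative named in the error message itself is to pass track_state=False to BayesianOptimizer.optimize, at the cost of only the final model being available.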