secondmind-labs/trieste

Don't decorate acquisition function call methods with tf.function

Closed this issue · 1 comments

Trieste's current approach to avoiding retracing is to decorate the __call__ method of acquisition function classes with tf.function. However, this has two negative side effects:

  1. It results in early compilation, restricting use of lazy initialisation of variables
  2. It keeps compiled graphs in a global cache that accumulates over time (e.g. during the test runs)

To fix this we should move the use of tf.functions further out. Note that this will probably be a significant amount of work.

Following this we can also convert some variables from dynamically shaped to lazy initialised (e.g. self._eps).

So it turns out that both of the assumptions in this ticket were incorrect!

  1. It is possible to use lazy initialised variables even with the current approach (see #718, though also #748).
  2. The graphs are stored on the instances not the class method so are ostensibly freed, though it also turns out some memory is leaked by tensorflow regardless of where the graphs are stored, meaning that runs with many compilations (such as tests) inevitably leak memory. A workaround is to split the test job into multiple runs (see #743 and #749).

Given that the official tensorflow documentation includes examples like the ones in trieste, I recommend closing this issue.