google/qhbm-library

Performance issue in baselines/train.py

DLPerf opened this issue · 1 comments

Hello! Our static bug checker has found a performance issue in baselines/train.py: train_step is called repeatedly in a for loop, but a tf.function-decorated function, train_inner_step, is defined and called inside train_step.

Because the decorated function is redefined on every call to train_step, each iteration of the loop creates a fresh tf.function object, which builds a new graph every time and can trigger the tf.function retracing warning.

Similarly, train_inner_step is defined inside train_model, and that outer function is called repeatedly here and here.

Here is the TensorFlow documentation that describes this behavior.

Briefly, for better efficiency, prefer:

@tf.function
def inner():
    pass

def outer():
    inner()  

over:

def outer():
    @tf.function
    def inner():
        pass
    inner()
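The difference can be observed directly. The following is a minimal sketch (the function names and the toy computation are illustrative, not from the repository): a module-level tf.function is traced once and reused, which can be confirmed with `experimental_get_tracing_count()`.

```python
import tensorflow as tf

@tf.function
def inner(x):
    # Defined once at module level: traced on the first call,
    # then the cached graph is reused for the same input signature.
    return x * 2.0

def outer_good(x):
    return inner(x)

def outer_bad(x):
    # A brand-new tf.function object is created on every call,
    # so its graph is rebuilt from scratch each time.
    @tf.function
    def inner_local(y):
        return y * 2.0
    return inner_local(x)

x = tf.constant(1.0)
for _ in range(5):
    outer_good(x)

# Despite five calls, the shared function was traced only once.
print(inner.experimental_get_tracing_count())
```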

Looking forward to your reply.

However, some variables in the inner function depend on the outer function's scope, so the code may become more complex if this change is made. Is the change necessary? Do you have any suggestions?
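One common way to handle that dependency, sketched below with hypothetical names (`train_inner_step`, `train_step`, and the `scale` value are illustrative, not the repository's actual signatures): values the inner function previously captured from the enclosing scope are passed in as tensor arguments instead, so a single module-level tf.function can serve every call without retracing.

```python
import tensorflow as tf

# Instead of closing over a value from the outer function's scope,
# the inner tf.function receives it as an explicit argument.
@tf.function
def train_inner_step(x, scale):
    return x * scale

def train_step(x, scale):
    # The outer function just forwards the outer-scope value.
    return train_inner_step(x, scale)

# Pass tensors, not Python scalars: a new Python value would
# otherwise trigger a retrace for each distinct value.
out = train_step(tf.constant(2.0), tf.constant(3.0))
print(float(out))
```

tf.Variable objects, by contrast, can safely be captured by a module-level tf.function, since mutating a variable does not cause a retrace.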