Performance issue in baselines/train.py
DLPerf opened this issue · 1 comment
Hello! Our static bug checker has found a performance issue in baselines/train.py: `train_step` is repeatedly called in a for loop, but a `tf.function`-decorated function, `train_inner_step`, is defined and called inside `train_step`. As a result, each call to `train_step` causes `train_inner_step` to build a new graph, which can trigger the `tf.function` retracing warning.
Similarly, `train_inner_step` is defined inside `train_model`, and the outer function is repeatedly called here and here.
Here is the TensorFlow documentation supporting this. Briefly, for better efficiency, it's better to use:
```python
@tf.function
def inner():
    pass

def outer():
    inner()
```

than:

```python
def outer():
    @tf.function
    def inner():
        pass

    inner()
```
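The difference can be illustrated without TensorFlow by a toy decorator that counts "traces" (a hypothetical stand-in for `tf.function` graph construction; all names below are invented for the sketch):

```python
TRACE_COUNT = 0

def traced(fn):
    # Toy stand-in for @tf.function: the first call to each *decorated
    # object* "traces" it (in TF terms, builds a graph).
    traced_once = False
    def wrapper(*args):
        nonlocal traced_once
        global TRACE_COUNT
        if not traced_once:
            TRACE_COUNT += 1  # simulate one graph construction
            traced_once = True
        return fn(*args)
    return wrapper

# Recommended pattern: decorate once, outside the loop-called function.
@traced
def inner_good(x):
    return x + 1

def outer_good(x):
    return inner_good(x)

# Problematic pattern: the decorator runs on every call to outer_bad,
# producing a fresh decorated object that must be traced again.
def outer_bad(x):
    @traced
    def inner_bad(y):
        return y + 1
    return inner_bad(x)

for i in range(5):
    outer_good(i)
traces_good = TRACE_COUNT  # 1: the single decorated object is traced once

TRACE_COUNT = 0
for i in range(5):
    outer_bad(i)
traces_bad = TRACE_COUNT  # 5: a new object is traced on every iteration

print(traces_good, traces_bad)  # 1 5
```

This mirrors why defining `train_inner_step` inside `train_step` leads to repeated retracing when the outer function runs in a loop.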
Looking forward to your reply.
However, some variables in the inner function depend on the outer function's scope, so the code may become more complex if this change is made. Is the change necessary? Do you have any suggestions?
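One common workaround when the inner function captures outer-scope variables is to hoist it to module level and pass those values as explicit arguments. A minimal sketch (the names mirror the issue, but the bodies are invented stand-ins, not the repository's actual code):

```python
# Hypothetical refactor sketch: train_inner_step no longer closes over
# outer-scope variables; it receives them as parameters instead, so it
# can be defined and decorated exactly once at module level.

# @tf.function  # in real code the decorator is applied here, once
def train_inner_step(model, optimizer, batch):
    # placeholder body; a real step would compute and apply gradients
    return model(batch) * optimizer

def train_step(model, optimizer, batch):
    # the loop-called outer function now reuses one decorated object
    return train_inner_step(model, optimizer, batch)

# Usage with stand-in objects:
model = lambda b: b + 1
optimizer = 2
results = [train_step(model, optimizer, b) for b in range(3)]
print(results)  # [2, 4, 6]
```

Since `tf.function` retraces on new Python-level argument values, passing tensors and `tf.Variable` objects (rather than plain Python numbers) keeps the single trace reusable across loop iterations.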