TF-Keras mixed precision training leads to autograph errors

Question

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux / Colab
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.16.1, 2.17.0.dev20240423
Python version: 3.10, 3.11
Bazel version (if compiling from source): -
GPU model and memory: NVIDIA Tesla T4
Exact command to reproduce: see colab notebooks below

Describe the problem.

Since TF version 2.16, mixed precision training fails to compile with Autograph and throws warnings when running a minimal mixed precision training example:

import os

os.environ["TF_USE_LEGACY_KERAS"] = "1"

import tensorflow as tf
from tensorflow import keras

keras.mixed_precision.set_global_policy('mixed_float16')
inputs = keras.Input(shape=(784,))
x = keras.layers.Dense(10)(inputs)
outputs = keras.layers.Activation('softmax', dtype='float32')(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='sparse_categorical_crossentropy', optimizer=keras.optimizers.RMSprop())

(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255

_ = model.fit(x_train, y_train, batch_size=128, epochs=1, steps_per_epoch=1, verbose=0)

Describe the current behavior.

TF Autograph doesn't transform the create_autocast_variable function and throws the following warnings:

WARNING:tensorflow:AutoGraph could not transform <function create_autocast_variable at 0x7a263a673400> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: <gast.gast.Expr object at 0x7a25b0be2e00>
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
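The warning above asks for a verbosity-10 log when filing the bug. A minimal sketch of enabling that from Python instead of the shell, assuming it runs before the training call: `AUTOGRAPH_VERBOSITY` is the variable named in the warning text, and `tf.autograph.set_verbosity` is the corresponding public API. The `try`/`except` guard is only an illustration aid so the snippet also runs where TensorFlow is not installed.

```python
import os

# Same effect as `export AUTOGRAPH_VERBOSITY=10` in the shell,
# but scoped to this Python process.
os.environ["AUTOGRAPH_VERBOSITY"] = "10"

try:
    import tensorflow as tf
except ImportError:  # guard added for illustration only
    tf = None

if tf is not None:
    # Programmatic equivalent; alsologtostdout=True echoes the
    # AutoGraph log to stdout, which is convenient in notebooks.
    tf.autograph.set_verbosity(10, alsologtostdout=True)
```

With this in place, rerunning the example should print the detailed AutoGraph conversion log that the warning asks to attach.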

Describe the expected behavior.

Autograph should be able to transform the function without throwing any warnings, as it did in TF 2.15.
Unfortunately we're currently unable to upgrade to Keras 3 due to other issues, so it would be good to be able to get this patched in TF-Keras as well.

@reedwm @fchollet do you know what could cause this issue? I'm happy to move this issue to TF if you think this is an issue with Autograph itself.

Standalone code to reproduce the issue.

See the following notebooks: This wasn't an issue in TF 2.15, but fails in TF 2.16 and still fails in TF Nightly.
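To compare TF versions without reading the logs by eye, the reproduction can be turned into a pass/fail check. The following is a rough sketch, not part of the original report: it assumes the warning is emitted through the Python logger named `tensorflow` (as the `WARNING:tensorflow:` prefix suggests), and the names `AutographWarningCollector` and `collect_autograph_warnings` are hypothetical helpers introduced here.

```python
import logging


class AutographWarningCollector(logging.Handler):
    """Collects log messages that report an AutoGraph transform failure."""

    MARKER = "AutoGraph could not transform"

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        message = record.getMessage()
        if self.MARKER in message:
            self.messages.append(message)


def collect_autograph_warnings(fn):
    """Runs fn() and returns the AutoGraph transform warnings it logged."""
    # Assumption: TensorFlow logs these warnings on the "tensorflow" logger.
    logger = logging.getLogger("tensorflow")
    collector = AutographWarningCollector()
    logger.addHandler(collector)
    try:
        fn()
    finally:
        # Always detach the handler, even if fn() raises.
        logger.removeHandler(collector)
    return collector.messages
```

For example, `collect_autograph_warnings(lambda: model.fit(x_train, y_train, batch_size=128, epochs=1, steps_per_epoch=1, verbose=0))` would be expected to return an empty list on an unaffected version and a non-empty list on an affected one, provided the logger assumption holds.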

Answer 1 · 2024-04-26T10:31:59.000Z

I was able to reproduce the issue on TensorFlow v2.16 and tf-nightly, whereas with TensorFlow v2.15 the same code ran without any warnings.

WARNING: AutoGraph could not transform <function create_autocast_variable at 0x79fe61a00310> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: <gast.gast.Expr object at 0x79fe0b9ac850>

Kindly find the gist of it here. Thank you!

Answer 2 · 2024-05-08T22:33:20.000Z

@lgeiger ,

Thanks for the bug report.

This issue was introduced when Keras 3 was made the default in TensorFlow 2.16. The fix should be in the nightly build tomorrow.

Fabien

Answer 3 · 2024-05-10T11:04:55.000Z

Thanks for the fix! Looks like the latest nightly indeed fixes this issue 👍

Answer 4 · 2024-05-10T13:18:14.000Z

@lgeiger,
As mentioned, the TF-Keras mixed precision issue has been resolved, and the warning no longer appears in the nightly build now that the PR has been merged.
Kindly find the gist for reference. Thank you!