"Variable is not running on full-size initialization" warning when using AdagradOptimizer
dzfish opened this issue · 3 comments
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
- TensorFlow version and how it was installed (source or binary): TF 2.8, installed by pip
- TensorFlow-Recommenders-Addons version and how it was installed (source or binary): 0.6, from source
- Python version: 3.8
- Is GPU used? (yes/no): no
Describe the bug
When I replace the AdamOptimizer with the AdagradOptimizer, I get a warning at the start of training like:
tensorflow::Variable [xxx_dynamic_embeddings/Adagrad/accumulator] is not running on full-size initialization: Cannot convert a symbolic Tensor (accumulator/accumulator/Read/strided_slice_1:0) to a numpy.
I wonder whether this has any negative effect on my training.
Code to reproduce the issue
In demo/dynamic_embedding/movielens-100k-estimator/movielens-100k-estimator.py:
  if mode == tf.estimator.ModeKeys.TRAIN:
-   optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
+   optimizer = tf.compat.v1.train.AdagradOptimizer(learning_rate=0.001)
It seems a patch like a __call__for_keras_init_v2 function is needed in tensorflow_recommenders_addons/dynamic_embedding/python/ops/tf_patch.py; a rough sketch of the idea follows.
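To make the suggestion concrete, here is a minimal sketch of the fallback behavior the warning describes: try full-size initialization first, and fall back to broadcasting a single initial vector when the shape cannot be evaluated. This is only an illustration of the idea, not TFRA's actual tf_patch.py code, and _init_with_fallback is a hypothetical helper name.

```python
import tensorflow as tf

def _init_with_fallback(initializer, shape, dtype=tf.float32):
  """Hypothetical helper: illustrates the fallback only."""
  try:
    # Full-size path: every row gets an independent initial value,
    # matching the initializer's intended distribution.
    return initializer(shape=shape, dtype=dtype)
  except (NotImplementedError, TypeError):
    # Fallback path: generate one vector and broadcast it to every
    # row. This is what "not running on full-size initialization"
    # refers to: all rows share the same initial vector.
    single = initializer(shape=(shape[-1],), dtype=dtype)
    return tf.broadcast_to(single, shape)

# Illustrative usage with a standard Keras initializer:
acc_init = _init_with_fallback(tf.keras.initializers.Zeros(), (1024, 16))
```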
@rhdong
@dzfish If it is convenient for you, you can refer to tensorflow_recommenders_addons/dynamic_embedding/python/ops/tf_patch.py and make a code contribution. Please look at the __call__for_keras_init_v1 function.
Hi @dzfish, thank you for your feedback. The warning means that a dynamic variable (which can be the trainable features or the slot(s) of an optimizer) was initialized with the same initial vector within one iteration. In our experience this usually has no negative effect, but it is only partially consistent with the original intention of the initializer algorithm. It is often caused by eager mode or by a special initializer that does not support full-size initialization. Your case should be related to eager mode; if possible, please try graph mode.
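For readers who want to try the graph-mode suggestion: with the TF1-style estimator demo, one way to force graph mode is to disable eager execution before any variables are created. This uses the standard tf.compat.v1 API; where exactly you call it depends on your own entry point.

```python
import tensorflow as tf

# Disable eager execution before building the model, so initializers
# see concrete shapes and can run full-size initialization.
tf.compat.v1.disable_eager_execution()
```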