TensorFlow version compatibility

Question

TensorFlow version compatibility

bransom960 opened this issue 4 years ago · 2 comments

I had been successfully using your great package for months, but now it seems a dependency is off, and I can't seem to find how to get back on track.

I followed your molecule example notebook as inspiration, and as stated, at some point this worked. Because it happened so suddenly, I'm not sure if others are seeing any dependency mismatches.

The meat of the code:
model = MEGNetModel(graph_converter=MolecularGraph(), centers=gaussian_centers, width=gaussian_width,
                    nfeat_node=27, nfeat_edge=27, nfeat_global=len(state_attributes[0]))
I use graph_converter=MolecularGraph() to convert structures into graphs, and then:
model.train_from_graphs(train_graphs=graph_train, train_targets=target_train,
                        validation_graphs=graph_validation, validation_targets=target_validation)

The training gets through 3 epochs and then errors out. The errors are listed below by Python and TensorFlow version. Other warnings occur, but they do not stop the code from running.

Using Python 3.8:
tf 2.4.1 TypeError: 'NoneType' object is not callable
tf 2.2.0 Error while reading resource variable Adam/beta_2_17667 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/Adam/beta_2_17667/N10tensorflow3VarE does not exist.
[[node Adam/Cast_3/ReadVariableOp (defined at /home/bransom/Programs/anaconda3/envs/tf-8/lib/python3.8/site-packages/megnet/models/base.py:222) ]] [Op:__inference_train_function_29030]

Function call stack:
train_function

Using Python 3.7:
with TensorFlow 1.x:
TypeError: Failed to convert object of type <class 'tuple'> to Tensor. Contents: (Dimension(96), 64). Consider casting elements to a supported type.
Apparently this is just a problem with tf 1.x.

with TensorFlow 2.x:
Error while reading resource variable Adam/iter_124774 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/Adam/iter_124774/N10tensorflow3VarE does not exist.
[[node Adam/ReadVariableOp (defined at /home/bransom/Programs/anaconda3/envs/tf-2/lib/python3.7/site-packages/megnet/models/base.py:230) ]] [Op:__inference_distributed_function_134432]

Function call stack:
distributed_function

I have read online that this could possibly be fixed using tensorflow.Session, but I'm not sure where to incorporate that into the megnet base code, since I'm using the prebuilt models.
I have also read that you can import Adam from tf.keras.optimizers. I have tried doing that at various places in the megnet code, without success.

Using Python 3.6:
with TensorFlow 2.x, many modules appear to have been renamed within TensorFlow, which causes many import errors.

Any insight would be great; I've been at this for weeks.

Answer 1 · 2021-03-23T05:14:42.000Z

It is difficult to tell what is causing this error. It could be that your data is not in the form it is supposed to be. Make sure that your data types match those in the example and that your target values do not contain any NaNs. If you can run the example, then the model should work for the same type of data.
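A quick way to act on the NaN advice is to scan the targets before training. The snippet below is only a sketch: `find_bad_targets` is a hypothetical helper, not part of megnet, and the sample `target_train` list is made up for illustration. It flags any target that is NaN, infinite, or not convertible to a float.

```python
import math
from collections.abc import Iterable


def is_finite_target(value):
    """True if value (a scalar, or a nested sequence of scalars) is entirely finite."""
    if isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
        return all(is_finite_target(v) for v in value)
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        # None, strings, and other non-numeric entries count as bad values
        return False


def find_bad_targets(targets):
    """Return the indices of targets that contain NaN, +/-inf, or non-numeric entries."""
    return [i for i, t in enumerate(targets) if not is_finite_target(t)]


# Made-up sample data standing in for the real training targets
target_train = [0.12, float("nan"), 1.5, float("inf"), -0.3]
bad = find_bad_targets(target_train)
print(bad)  # [1, 3]
```

Any index reported here should be dropped from both the graph list and the target list, so the two stay aligned.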

Answer 2 · 2021-03-23T17:52:39.000Z

Thank you! I found it. It was something in my data structures. I was trying a new dataset, and it appears my catch-all for infinities and NaNs missed a few values. I also had to make sure that the shape of the data stays the same through the structure-to-graph conversion. Thank you!
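For anyone hitting the same problem, the shape check can be sketched roughly as follows. This assumes each converted graph is a dict with "atom", "bond", and "state" entries holding lists of feature rows; that layout is an assumption here, so adjust the keys to whatever your converter actually returns. `find_shape_mismatches` is a hypothetical helper, not a megnet function, and the sample graphs are made up.

```python
def row_width(row):
    """Length of a feature row; a scalar feature counts as width 1."""
    try:
        return len(row)
    except TypeError:
        return 1


def find_shape_mismatches(graphs, keys=("atom", "bond", "state")):
    """Return (graph_index, key, widths) for graphs whose feature widths are
    ragged or differ from the first non-empty graph seen for that key."""
    expected = {}
    mismatches = []
    for i, graph in enumerate(graphs):
        for key in keys:
            widths = {row_width(row) for row in graph[key]}
            if not widths:
                # Nothing to compare, e.g. a single atom with no bonds
                continue
            # The first non-empty entry for each key serves as the reference
            expected.setdefault(key, widths)
            if len(widths) != 1 or widths != expected[key]:
                mismatches.append((i, key, sorted(widths)))
    return mismatches


# Made-up graphs: the second one has a truncated atom feature row
graphs = [
    {"atom": [[1, 0, 0], [0, 1, 0]], "bond": [[0.1, 0.2]], "state": [[0.0, 0.0]]},
    {"atom": [[1, 0, 0], [0, 1]], "bond": [[0.3, 0.4]], "state": [[0.0, 0.0]]},
]
print(find_shape_mismatches(graphs))  # [(1, 'atom', [2, 3])]
```

The widths that come out of a clean run should also agree with the nfeat_node, nfeat_edge, and nfeat_global values passed to the model.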
