Model tf_cnn_histology - Federation run fails with - AttributeError: 'Adam' object has no attribute 'weights'
Closed this issue · 3 comments
Describe the bug
While trying the Quick Start Guide for model tf_cnn_histology, the federation run is failing.
Last 2-3 lines from the error message:
line 334, in get_tensor_dict
opt_weights = self._get_weights_dict(self.model.optimizer, suffix)
File "/home/azureuser/openfl-quickstart/lib/python3.10/site-packages/openfl/federated/task/runner_keras.py", line 301, in _get_weights_dict
weight_names = [weight.name for weight in obj.weights]
AttributeError: 'Adam' object has no attribute 'weights'
To Reproduce
Steps to reproduce the behavior:
- Follow the steps mentioned in Quick Start replacing model torch_cnn_mnist with tf_cnn_histology
- Create workspace, certify it.
- Generate CSR request for aggregator with CA signing it.
- Initialise the plan
- Setup Collaborator1 & Collaborator2
- Run the federation.
Expected behavior
Experiment should complete without error.
Screenshots
Machine
Ubuntu 22.04
Thanks for reporting this @noopurintel. This is actually a known issue with the tf workspaces and is being tracked in #973. However, with the release of keras 3.x (which is used by tensorflow v2.16+ by default), additional changes were introduced that further break our tf workspaces. Since tensorflow is now at v2.18, the original fix is stale and will likely require much more refactoring to get it working
Edit:
A quick fix is to use the legacy Adam optimizer (which gets deprecated in later TF versions) in place of L91
self.optimizer = tf.keras.optimizers.legacy.Adam()
In fact, as a temp fix, we can update all tf workspaces with the respective legacy optimizers to get them working on tf v2.13, but a long term solution would require a refactor of the workspace
Adding duplicate
label for better tracking, but leaving it open since it directly calls out the bug. It'll be part of a bigger effort to update the workspace
Fixed in #1194.