convert_model.mnist_eg.tf errors, possibly due to different TF versions
theta-lin opened this issue · 1 comments
> python3 --version
Python 3.11.8
> pip list | grep tensorflow
tensorflow 2.16.1
I encountered errors executing python3 -m convert_model.mnist_eg.tf. Since the version of tensorflow is unspecified in convert_model/requirements.txt, perhaps tensorflow works differently in my version. (The version of coremltools is also unspecified, which might also be problematic.)
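One way to remove this ambiguity would be to record the versions that are actually installed in a known-good environment and pin them in convert_model/requirements.txt. Below is a minimal sketch using only the standard library. The helper name pinned_lines is hypothetical, and the package list is simply the two packages mentioned above.

```python
from importlib import metadata


def pinned_lines(packages):
    """Return 'name==version' lines for installed packages, skipping missing ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed in this environment, so there is nothing to pin.
            continue
    return lines


if __name__ == "__main__":
    # Prints one requirements-style line per installed package.
    print("\n".join(pinned_lines(["tensorflow", "coremltools"])))
```

Running this in an environment where the converter is known to work would give lines that could be pasted into requirements.txt.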
2024-04-17 15:25:44.036395: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:25:44.040311: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:25:44.091677: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-17 15:25:46.018733: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-17 15:25:47.542628: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 15:25:47.544149: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 32, in <module>
tflite()
File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 25, in tflite
save_model(model, SAVED_MODEL_DIR)
File "/data/dku/fedcampus/FedKit/convert_model/tflite.py", line 66, in save_model
parameters = model.parameters.get_concrete_function()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1251, in get_concrete_function
concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1221, in _get_concrete_function_garbage_collected
self._initialize(args, kwargs, add_initializers_to=initializers)
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 696, in _initialize
self._concrete_variable_creation_fn = tracing_compilation.trace_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py", line 178, in trace_function
concrete_function = _maybe_define_function(
^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py", line 283, in _maybe_define_function
concrete_function = _create_concrete_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py", line 310, in _create_concrete_function
traced_func_graph = func_graph_module.func_graph_from_py_func(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/framework/func_graph.py", line 1059, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 599, in wrapped_fn
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1719, in bound_method_wrapper
return wrapped_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/autograph_util.py", line 52, in autograph_handler
raise e.ag_error_metadata.to_exception(e)
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/autograph_util.py", line 41, in autograph_handler
return api.converted_call(
^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
result = converted_f(*effective_args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/__autograph_generated_file__ud6qml.py", line 12, in tf__parameters
retval_ = {f'a{ag__.ld(index)}': ag__.converted_call(ag__.ld(weight).read_value, (), None, fscope) for index, weight in ag__.converted_call(ag__.ld(enumerate), (ag__.ld(self).model.weights,), None, fscope)}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/__autograph_generated_file__ud6qml.py", line 12, in <dictcomp>
retval_ = {f'a{ag__.ld(index)}': ag__.converted_call(ag__.ld(weight).read_value, (), None, fscope) for index, weight in ag__.converted_call(ag__.ld(enumerate), (ag__.ld(self).model.weights,), None, fscope)}
^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: in user code:
File "/data/dku/fedcampus/FedKit/convert_model/tflite.py", line 31, in parameters *
f"a{index}": weight.read_value()
AttributeError: 'Variable' object has no attribute 'read_value'
After changing line 31 of FedKit/convert_model/tflite.py (at commit e203312) from
f"a{index}": weight.read_value()
to
f"a{index}": weight.value.read_value()
this error is resolved, but I then encountered another error:
2024-04-17 15:54:33.916521: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:54:33.920357: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:54:33.969224: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-17 15:54:35.695073: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-17 15:54:37.321051: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 15:54:37.322647: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 32, in <module>
tflite()
File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 25, in tflite
save_model(model, SAVED_MODEL_DIR)
File "/data/dku/fedcampus/FedKit/convert_model/tflite.py", line 71, in save_model
tf.saved_model.save(
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1392, in save
save_and_return_nodes(obj, export_dir, signatures, options)
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1427, in save_and_return_nodes
_build_meta_graph(obj, signatures, options, meta_graph_def))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1642, in _build_meta_graph
return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1566, in _build_meta_graph_impl
asset_info, exported_graph = _fill_meta_graph_def(
^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 933, in _fill_meta_graph_def
signatures = _generate_signatures(signature_functions, object_map, defaults)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 655, in _generate_signatures
outputs = object_map[function](**{
^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/saved_model_exported_concrete.py", line 45, in __call__
export_captures = _map_captures_to_created_tensors(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/saved_model_exported_concrete.py", line 74, in _map_captures_to_created_tensors
_raise_untracked_capture_error(function.name, exterior, interior)
File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/saved_model_exported_concrete.py", line 98, in _raise_untracked_capture_error
raise AssertionError(msg)
AssertionError: Tried to export a function which references an 'untracked' resource. TensorFlow objects (e.g. tf.Variable) captured by functions must be 'tracked' by assigning them to an attribute of a tracked object or assigned to an attribute of the main object directly. See the information below:
Function name = b'__inference_signature_wrapper_1685'
Captured Tensor = <ResourceHandle(name="loss/total/10", device="/job:localhost/replica:0/task:0/device:CPU:0", container="Anonymous", type="tensorflow::Var", dtype and shapes : "[ DType enum: 1, Shape: [] ]")>
Trackable referencing this tensor = <tf.Variable 'loss/total:0' shape=() dtype=float32>
Internal Tensor = Tensor("1637:0", shape=(), dtype=resource)
8 restore test results.
According to answers such as https://stackoverflow.com/questions/73416907/model-save-tried-to-export-a-function-which-references-untracked-resource-eve, the use of static members might cause this problem, but I don't know how to fix it in this project's case.
I was able to reproduce this. Unfortunately, I don't know how to solve it.
Ideas:
- Try with the TensorFlow version specified here: https://github.com/adap/flower/blob/main/examples/android-kotlin/gen_tflite/pyproject.toml
- Check out TFLite's latest on-device training example and compare it to the 2022 one (which this converter is based on).