SavedModel to TensorRT converter fails if the model uses lookup tables
Description
The TF-TRT converter fails to save the model if it uses a lookup table. This colab illustrates how I create a simple model that uses a Keras IndexLookup
layer, which in turn makes the SavedModel contain a DT_RESOURCE tensor with the lookup table contents. The last cell of the notebook, which does the conversion to TensorRT, fails because TensorRT is not currently supported on Colab. I use the NVIDIA TensorFlow container to actually convert the model from SavedModel to TensorRT; see "Steps To Reproduce" below.
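For context, here is a minimal sketch of the kind of model involved (illustrative only: the real colab model differs, and the "cabin" input name and vocabulary below are made up). StringLookup is the public layer built on IndexLookup; given a vocabulary it creates a static hash table, which is what shows up in the SavedModel as a DT_RESOURCE tensor:

```python
import tensorflow as tf

# Illustrative sketch, not the actual colab code: a single string input
# routed through a StringLookup layer. The lookup's hash table becomes a
# DT_RESOURCE tensor in the exported SavedModel.
inp = tf.keras.Input(shape=(1,), dtype=tf.string, name="cabin")
lookup = tf.keras.layers.StringLookup(vocabulary=["A", "B", "C"])
ids = lookup(inp)  # OOV -> 0, "A" -> 1, "B" -> 2, "C" -> 3
model = tf.keras.Model(inp, ids)

# Export as a SavedModel directory, analogous to /models/model in the repro.
tf.saved_model.save(model, "lookup_model")
```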
Does TF-TRT support lookup tables? If not, are there known workarounds?
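One workaround I'm considering (an assumption on my part, not verified end to end and not documented by either converter) is to move the vocabulary lookup out of the SavedModel entirely and do it in host-side preprocessing, so the graph handed to TF-TRT only carries numeric tensors and no hash tables. A plain-Python sketch of that preprocessing step, replicating the Keras lookup layers' default mapping:

```python
# Workaround sketch: do the string -> id lookup on the host before calling
# the model, so the SavedModel given to TF-TRT never contains a hash table.
# The vocabulary here is illustrative.
VOCAB = ["A", "B", "C"]

# Replicate the default StringLookup mapping:
# out-of-vocabulary -> 0, known terms -> 1..len(VOCAB).
TOKEN_TO_ID = {token: i + 1 for i, token in enumerate(VOCAB)}

def tokens_to_ids(tokens):
    """Map string tokens to integer ids host-side, before inference."""
    return [TOKEN_TO_ID.get(t, 0) for t in tokens]

print(tokens_to_ids(["B", "Z", "A"]))  # [2, 0, 1]
```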
I tried to search for related bug reports or changes, and found that older versions of TF failed with a different error (see e.g. tensorflow/tensorflow#42673). This commit seems to have changed the error to the one I'm reporting.
I also attempted to convert to ONNX but hit a similar problem, found an existing open bug report and attached a colab reproducing the error there, see onnx/tensorflow-onnx#1867.
Environment
TensorRT Version: 8.2.5-1+cuda11.4
NVIDIA GPU: A10G
NVIDIA Driver Version: 470.57.02 (in CUDA Forward Compatibility mode "Using CUDA 11.7 driver version 515.48.08 with kernel driver version 470.57.02")
CUDA Version: 11.7
Operating System: Ubuntu 20.04
Python Version (if applicable): 3.8.10
Tensorflow Version (if applicable): 2.9.1
Baremetal or Container (if so, version): nvcr.io/nvidia/tensorflow:22.06-tf2-py3
Steps To Reproduce
1. Train a model using the code from the colab notebook and store it to /models/model.
2. Start the NVIDIA TensorFlow container with:
docker run --gpus all -it -v/models:/models --rm nvcr.io/nvidia/tensorflow:22.06-tf2-py3
3. Start the Python REPL and run the following script:
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert the SavedModel with the default TF-TRT parameters.
conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS
converter = trt.TrtGraphConverterV2(input_saved_model_dir="/models/model", conversion_params=conversion_params)
converter.convert()
# Saving the converted model is the step that raises the error below.
converter.save("/models/output")
The error traceback:
Traceback (most recent call last):
File "/models/convert_to_tensorrt.py", line 7, in <module>
converter.save("/models/output")
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/compiler/tensorrt/trt_convert.py", line 1510, in save
save.save(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1290, in save
save_and_return_nodes(obj, export_dir, signatures, options)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1325, in save_and_return_nodes
_build_meta_graph(obj, signatures, options, meta_graph_def))
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1491, in _build_meta_graph
return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1437, in _build_meta_graph_impl
signature_serialization.canonicalize_signatures(signatures))
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/signature_serialization.py", line 180, in canonicalize_signatures
final_concrete = signature_wrapper._get_concrete_function_garbage_collected( # pylint: disable=protected-access
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 1219, in _get_concrete_function_garbage_collected
self._initialize(args, kwargs, add_initializers_to=initializers)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 785, in _initialize
self._stateful_fn._get_concrete_function_internal_garbage_collected( # pylint: disable=protected-access
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 2480, in _get_concrete_function_internal_garbage_collected
graph_function, _ = self._maybe_define_function(args, kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 2711, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 2627, in _create_graph_function
func_graph_module.func_graph_from_py_func(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/func_graph.py", line 1141, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 677, in wrapped_fn
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/func_graph.py", line 1127, in autograph_handler
raise e.ag_error_metadata.to_exception(e)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/func_graph.py", line 1116, in autograph_handler
return autograph.converted_call(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
result = converted_f(*effective_args, **kwargs)
File "/tmp/__autograph_generated_filelbyy4owv.py", line 12, in tf__signature_wrapper
structured_outputs = ag__.converted_call(ag__.ld(signature_function), (), dict(**ag__.ld(kwargs)), fscope)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 377, in converted_call
return _call_unconverted(f, args, kwargs, options)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 458, in _call_unconverted
return f(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 1602, in __call__
return self._call_impl(args, kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/wrap_function.py", line 243, in _call_impl
return super(WrappedFunction, self)._call_impl(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 1620, in _call_impl
return self._call_with_flat_signature(args, kwargs, cancellation_manager)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 1652, in _call_with_flat_signature
raise TypeError(f"{self._flat_signature_summary()} missing required "
TypeError: in user code:
TypeError: pruned(age, cabin, unknown) missing required arguments: unknown.
I'm not an expert on TF-TRT, but TRT doesn't support the IndexLookup layer, and ONNX doesn't support this layer either?
Also, from your log I don't think TensorRT gets involved here, so it would be better to ask in the TensorFlow repo.
I'm not an expert on TF-TRT, but TRT doesn't support the IndexLookup layer, and ONNX doesn't support this layer either?
When I hit the problem I suspected that neither of the converters supports IndexLookup (or rather TF hash tables), but I didn't find any evidence of that. Should this be documented somewhere, or explained in the error messages?
Also, from your log I don't think TensorRT gets involved here, so it would be better to ask in the TensorFlow repo.
Good point. Somehow I thought the converter (which produces the model that can't be saved) was in the scope of TensorRT itself, but you're right, it's all in the TF repo, so I should've filed this bug there.
So there are several issues in both the TF and TensorRT repos that mention this problem, without anyone clearly confirming that TensorRT or TF-TRT does not support lookup tables. This is the list of issues that cross-reference each other (I'm linking to the particular comments that say it "seems like" lookup tables are not supported):
tensorflow/tensorflow#46254 (comment)
tensorflow/tensorrt#233 (comment)
tensorflow/text#486 (comment)
tensorflow/tensorrt#233 (comment)
The last comment, from @bixia1, said that this was fixed, and she linked the fix I mentioned in the description, which didn't actually fix the problem.
I don't think I should file yet another issue like this in the TF repo.
Able to reproduce this in TF 2.9 (using nvidia/tensorflow:22.08-tf2-py3).
I will close this since there has been no activity for a long time. Also, FYI, there are more TF-TRT experts for this issue in https://github.com/tensorflow/tensorrt/issues, thanks!