The correct way to run a converted model in C++
yurkovak opened this issue · 4 comments
Hi! I found this comment promising an example of using TF-TRT in C++, but unfortunately it doesn't seem to exist just yet. It's much needed and I'm looking forward to it, thanks in advance!
In the meantime, I was able to proceed, but stumbled on the crash described below. I'd be happy if someone could point out what I'm doing wrong.
- OS: Ubuntu 16.04
- Python: 3.7.0
- TensorFlow (Python): tensorflow-gpu==1.15.0 from pip
- TensorFlow (C++): tag v1.12.0-rc2, compiled from source with GPU & TensorRT support against /usr/bin/python
- Converting the model (SavedModel -> SavedModel):
Converts successfully, creates a few TRTEngineOp nodes, saved_model_cli shows adequate output.
```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverter(
    input_saved_model_dir='/path/to/frozen/SavedModel/folder',
    nodes_blacklist=outputs,  # list of output node names
    max_batch_size=2,
    max_workspace_size_bytes=2 * pow(10, 9),  # 2 GB
    precision_mode='FP32',
    minimum_segment_size=3,
    is_dynamic_op=False
)
trt_graph = converter.convert()
converter.save('/path/for/converted/SavedModel/folder')
```
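As an extra sanity check from the C++ side (a sketch I put together, not part of the original workflow), the converted saved_model.pb can be parsed with the TensorFlow protos and the TRTEngineOp nodes counted, which also confirms the C++ build can at least read the converted graph:

```cpp
#include <iostream>
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/protobuf/saved_model.pb.h"

int main() {
  tensorflow::SavedModel saved_model;
  TF_CHECK_OK(tensorflow::ReadBinaryProto(
      tensorflow::Env::Default(),
      "/path/for/converted/SavedModel/folder/saved_model.pb",
      &saved_model));
  int trt_ops = 0;
  // A SavedModel holds one MetaGraphDef per tag set; "serve" is the first here.
  for (const auto& node : saved_model.meta_graphs(0).graph_def().node()) {
    if (node.op() == "TRTEngineOp") ++trt_ops;
  }
  std::cout << "TRTEngineOp nodes: " << trt_ops << std::endl;
  return 0;
}
```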
- Loading in C++:

```cpp
#include <memory>
#include "tensorflow/c/c_api.h"
#include "tensorflow/cc/saved_model/loader.h"

// Load the TF-TRT op/kernel library before loading the graph.
TF_Status* status_lib = TF_NewStatus();
TF_LoadLibrary("/path/to/_trt_engine_op.so", status_lib);

std::unique_ptr<tensorflow::SavedModelBundle> bundle;
tensorflow::SessionOptions session_options;
tensorflow::RunOptions run_options;
session_options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(mem_frac);  // mem_frac defined elsewhere
session_options.config.mutable_gpu_options()->set_allow_growth(true);
bundle.reset(new tensorflow::SavedModelBundle);
auto status = tensorflow::LoadSavedModel(session_options, run_options,
                                         "/path/for/converted/SavedModel/folder",
                                         {"serve"}, bundle.get());
```

Dies with the following farewell:

```
No OpKernel was registered to support Op 'TRTEngineOp' with these attrs. Registered devices: [CPU,GPU,XLA_CPU,XLA_GPU], Registered kernels: <no registered kernels>
 [[{{node TRTEngineOp_0}} = TRTEngineOp[InT=[DT_FLOAT], OutT=[DT_FLOAT], _output_shapes=[<unknown>], cached_engine_batches=[], calibration_data="", fixed_input_size=true, input_shapes=[], max_cached_engines_count=1, output_shapes=[], precision_mode="FP32", segment_func=TRTEngineOp_0_native_segment[], segment_funcdef_name="", serialized_segment="\030\017\0...00\000\000", static_engine=true, use_calibration=false, workspace_size_bytes=11075949] (rgb_to_grayscale/Tensordot)]]
```
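One thing worth ruling out: the snippet above never inspects status_lib, so a failed load of _trt_engine_op.so would go unnoticed and would produce exactly this "no registered kernels" error. A minimal check, using only the C API calls already in play:

```cpp
#include <cstdio>
#include "tensorflow/c/c_api.h"

// Returns true iff the TF-TRT kernel library actually loaded.
bool LoadTrtOps(const char* path) {
  TF_Status* status = TF_NewStatus();
  TF_Library* lib = TF_LoadLibrary(path, status);
  const bool ok = (lib != nullptr && TF_GetCode(status) == TF_OK);
  if (!ok) {
    std::fprintf(stderr, "TF_LoadLibrary(%s) failed: %s\n",
                 path, TF_Message(status));
  }
  TF_DeleteStatus(status);
  return ok;
}
```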
I am able to run successfully:
- the original model from '/path/to/frozen/SavedModel/folder' with the C++ code above (a run sketch follows this list)
- the converted model from '/path/for/converted/SavedModel/folder' in Python
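For reference, here is roughly what the "run" half looks like once LoadSavedModel succeeds. This is a sketch, not the project's official example: the tensor names "input:0"/"output:0" and the input shape are placeholders, and the real ones come from the model's SignatureDef (saved_model_cli prints them):

```cpp
#include <string>
#include <utility>
#include <vector>
#include "tensorflow/cc/saved_model/loader.h"
#include "tensorflow/core/framework/tensor.h"

// Run one inference on the loaded bundle. Feed/fetch names and the shape
// {2, 224, 224, 3} are placeholders; substitute the ones your model exports.
tensorflow::Status RunOnce(tensorflow::SavedModelBundle* bundle) {
  tensorflow::Tensor input(tensorflow::DT_FLOAT,
                           tensorflow::TensorShape({2, 224, 224, 3}));
  input.flat<float>().setZero();  // dummy batch of 2, matching max_batch_size
  std::vector<tensorflow::Tensor> outputs;
  return bundle->session->Run({{"input:0", input}},  // feeds
                              {"output:0"},          // fetches
                              {},                    // target nodes
                              &outputs);
}
```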
I was able to get rid of this error using this answer, but now I get:
```
Invalid argument: NodeDef mentions attr 'segment_func' not in Op<name=TRTEngineOp
...
(Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)
```
I believe it's due to the 1.15.0 vs 1.12.0-rc2 TensorFlow version mismatch. I'm trying to build a wheel, since it looks like 1.12 isn't available on pip anymore. Will update later.
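The quickest way to confirm such a mismatch is to print the runtime version on both sides. In C++ the C API exposes it directly (a minimal sketch):

```cpp
#include <cstdio>
#include "tensorflow/c/c_api.h"

int main() {
  // Prints the version the C++ runtime was built from, e.g. "1.12.0-rc2".
  std::printf("TensorFlow runtime version: %s\n", TF_Version());
  return 0;
}
```

On the Python side, `python -c "import tensorflow as tf; print(tf.__version__)"` gives the converter's version. A graph written by a newer converter can carry attrs (like segment_func in the error above) that an older runtime's TRTEngineOp registration doesn't know about.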
EDITED: I can confirm that everything works once the same TensorFlow version is used both to convert the model and to run inference, so I'm closing this issue.
@yurkovak Hello, have you successfully run the TF-TRT model in C++?
CC: @meena-at-work
@YouSenRong We are currently building benchmarks in C++. Please have a look here: https://github.com/tensorflow/tensorrt/tree/master/tftrt/benchmarking-cpp
@DEKHTIARJonathan Thanks a lot