tensorflow/tensorrt

The correct way to run a converted model in C++

yurkovak opened this issue · 4 comments

Hi! I found this comment promising an example of using TF-TRT in C++, but unfortunately it doesn't seem to exist just yet. It's much needed and I'm looking forward to it, thanks in advance!

In the meantime, I was able to proceed, but stumbled on the crash described below. I'd be happy if someone could point out what I'm doing wrong.

  • OS: Ubuntu 16.04
  • Python: 3.7.0
  • TensorFlow Python: tensorflow-gpu==1.15.0 from pip
  • TensorFlow C++: tag v1.12.0-rc2, compiled from source with GPU & TensorRT support and /usr/bin/python
  1. Converting the model (SavedModel -> SavedModel):
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    # `outputs` is the list of the model's output node names
    converter = trt.TrtGraphConverter(
        input_saved_model_dir='/path/to/frozen/SavedModel/folder',
        nodes_blacklist=outputs,
        max_batch_size=2,
        max_workspace_size_bytes=2 * pow(10, 9),  # 2 GB
        precision_mode='FP32',
        minimum_segment_size=3,
        is_dynamic_op=False
    )
    trt_graph = converter.convert()
    converter.save('/path/for/converted/SavedModel/folder')
    The model converts successfully, creates a few TRTEngineOp nodes, and saved_model_cli shows adequate output.
  2. Loading in C++:
    TF_Status* status_lib = TF_NewStatus();
    // Load the TF-TRT op library shipped with the Python package
    TF_Library* trt_lib = TF_LoadLibrary("/path/to/_trt_engine_op.so", status_lib);

    std::unique_ptr<tensorflow::SavedModelBundle> bundle;
    tensorflow::SessionOptions session_options;
    tensorflow::RunOptions run_options;
    session_options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(mem_frac);  // mem_frac is set elsewhere
    session_options.config.mutable_gpu_options()->set_allow_growth(true);
    bundle.reset(new tensorflow::SavedModelBundle);
    auto status = tensorflow::LoadSavedModel(session_options, run_options,
                                             "/path/for/converted/SavedModel/folder",
                                             {"serve"}, bundle.get());
    Loading dies with the following farewell (see the registration-check sketch after this list):
    No OpKernel was registered to support Op 'TRTEngineOp' with these attrs.  Registered devices:     [CPU,GPU,XLA_CPU,XLA_GPU], Registered kernels:
      <no registered kernels>
    
    	 [[{{node TRTEngineOp_0}} = TRTEngineOp[InT=[DT_FLOAT], OutT=[DT_FLOAT], _output_shapes=[<unknown>], cached_engine_batches=[], calibration_data="", fixed_input_size=true, input_shapes=[], max_cached_engines_count=1, output_shapes=[], precision_mode="FP32", segment_func=TRTEngineOp_0_native_segment[], segment_funcdef_name="", serialized_segment="\030\017\0...00\000\000", static_engine=true, use_calibration=false, workspace_size_bytes=11075949] 
    (rgb_to_grayscale/Tensordot)]]
    
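For what it's worth, a quick way to see whether the loaded library registered anything is to look the op up in the binary's op registry right after TF_LoadLibrary. The sketch below is only a diagnostic (the library path is a placeholder); note that it checks op registration, not kernel registration, and the "no registered kernels" message above specifically means the TRTEngineOp kernel is missing from the C++ build.

    // Diagnostic sketch: confirm the op library loaded and that 'TRTEngineOp'
    // is present in the op registry before calling LoadSavedModel.
    #include <iostream>

    #include "tensorflow/c/c_api.h"
    #include "tensorflow/core/framework/op.h"
    #include "tensorflow/core/lib/core/status.h"

    bool TrtOpRegistered(const char* trt_lib_path) {
      TF_Status* status = TF_NewStatus();
      TF_Library* lib = TF_LoadLibrary(trt_lib_path, status);
      if (TF_GetCode(status) != TF_OK) {
        std::cerr << "TF_LoadLibrary failed: " << TF_Message(status) << std::endl;
      }
      TF_DeleteStatus(status);
      (void)lib;  // handle not used further in this sketch

      // Ask the process-wide op registry whether the op definition is known.
      const tensorflow::OpRegistrationData* op_data = nullptr;
      tensorflow::Status lookup =
          tensorflow::OpRegistry::Global()->LookUp("TRTEngineOp", &op_data);
      if (!lookup.ok()) {
        std::cerr << "'TRTEngineOp' not registered: " << lookup.error_message()
                  << std::endl;
        return false;
      }
      return true;
    }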

I am able to run successfully:

  • the original model from '/path/to/frozen/SavedModel/folder' with this C++ code (a minimal Session::Run sketch is shown below)
  • the converted '/path/for/converted/SavedModel/folder' in Python

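For reference, the inference call used with the loaded bundle is the usual SavedModelBundle path. A minimal sketch, with the tensor names ("input:0"/"output:0") and the input shape as placeholders for whatever saved_model_cli reports for the model:

    // Minimal sketch of running the loaded bundle once LoadSavedModel succeeds.
    #include <vector>

    #include "tensorflow/cc/saved_model/loader.h"
    #include "tensorflow/core/framework/tensor.h"

    tensorflow::Status RunOnce(tensorflow::SavedModelBundle* bundle) {
      // Dummy input; shape and dtype must match the model's signature.
      tensorflow::Tensor input(tensorflow::DT_FLOAT,
                               tensorflow::TensorShape({1, 224, 224, 3}));
      input.flat<float>().setZero();

      std::vector<tensorflow::Tensor> outputs;
      return bundle->session->Run({{"input:0", input}},  // feeds
                                  {"output:0"},          // fetches
                                  {},                    // target nodes
                                  &outputs);
    }
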
I was able to get rid of this error using this answer, but now I get:

Invalid argument: NodeDef mentions attr 'segment_func' not in Op<name=TRTEngineOp
...
(Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).

I believe it's due to the 1.15.0 vs 1.12.0-rc2 TensorFlow version mismatch. I'm trying to build a wheel, because it looks like 1.12 isn't available from pip anymore. Will update later.
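
One cheap sanity check for this kind of mismatch is to print the version the C++ binary was built against (TF_VERSION_STRING from tensorflow/core/public/version.h) and compare it with tf.__version__ on the Python side that did the conversion. A sketch:

    // Print the TensorFlow version the C++ binary was compiled against, to be
    // compared with the Python tf.__version__ used for the TF-TRT conversion.
    #include <iostream>

    #include "tensorflow/core/public/version.h"

    int main() {
      std::cout << "TensorFlow C++ version: " << TF_VERSION_STRING << std::endl;
      return 0;
    }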

EDITED: I can confirm that everything works once the same TensorFlow version is used both to convert the model and to run inference, so I'm closing this issue.

@yurkovak
Hello, have you successfully run the TF-TRT model in C++?

CC: @meena-at-work

@YouSenRong We are currently building benchmarks in C++. Please have a look here: https://github.com/tensorflow/tensorrt/tree/master/tftrt/benchmarking-cpp