tensorflow/tensorrt

The correct way to run a converted model in C++

yurkovak opened this issue · 4 comments

Hi! I found this comment promising an example of using TF-TRT in C++, but unfortunately it doesn't seem to exist just yet. It's much needed and I'm looking forward to it, thanks in advance!

In the meantime, I was able to proceed, but stumbled on the crash described below. I'd be happy if someone could point out what I'm doing wrong.

  • OS: Ubuntu 16.04
  • Python: 3.7.0
  • TensorFlow Python: tensorflow-gpu==1.15.0 from pip
  • TensorFlow C++: tag v1.12.0-rc2, compiled from source with GPU & TensorRT support and /usr/bin/python
  1. Converting the model (SavedModel -> SavedModel):
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    # `outputs` is the list of the model's output node names
    converter = trt.TrtGraphConverter(
        input_saved_model_dir='/path/to/frozen/SavedModel/folder',
        nodes_blacklist=outputs,
        max_batch_size=2,
        max_workspace_size_bytes=2 * pow(10, 9),  # 2 GB
        precision_mode='FP32',
        minimum_segment_size=3,
        is_dynamic_op=False
    )
    trt_graph = converter.convert()
    converter.save('/path/for/converted/SavedModel/folder')
    The model converts successfully, creates a few TRTEngineOp nodes, and saved_model_cli shows adequate output.
  2. Loading in C++:
    TF_Status* status_lib = TF_NewStatus();
    // Load the TF-TRT op library shipped with the Python package
    TF_Library* trt_lib = TF_LoadLibrary("/path/to/_trt_engine_op.so", status_lib);

    std::unique_ptr<tensorflow::SavedModelBundle> bundle;
    tensorflow::SessionOptions session_options;
    tensorflow::RunOptions run_options;
    session_options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(mem_frac);  // mem_frac is set elsewhere
    session_options.config.mutable_gpu_options()->set_allow_growth(true);
    bundle.reset(new tensorflow::SavedModelBundle);
    auto status = tensorflow::LoadSavedModel(session_options, run_options,
                                             "/path/for/converted/SavedModel/folder",
                                             {"serve"}, bundle.get());
    Loading dies with the following farewell (see the registration-check sketch after this list):
    No OpKernel was registered to support Op 'TRTEngineOp' with these attrs.  Registered devices:     [CPU,GPU,XLA_CPU,XLA_GPU], Registered kernels:
      <no registered kernels>
    
    	 [[{{node TRTEngineOp_0}} = TRTEngineOp[InT=[DT_FLOAT], OutT=[DT_FLOAT], _output_shapes=[<unknown>], cached_engine_batches=[], calibration_data="", fixed_input_size=true, input_shapes=[], max_cached_engines_count=1, output_shapes=[], precision_mode="FP32", segment_func=TRTEngineOp_0_native_segment[], segment_funcdef_name="", serialized_segment="\030\017\0...00\000\000", static_engine=true, use_calibration=false, workspace_size_bytes=11075949] 
    (rgb_to_grayscale/Tensordot)]]
    
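For what it's worth, a quick way to see whether the loaded library registered anything is to look the op up in the binary's op registry right after TF_LoadLibrary. The sketch below is only a diagnostic (the library path is a placeholder); note that it checks op registration, not kernel registration, and the "no registered kernels" message above specifically means the TRTEngineOp kernel is missing from the C++ build.

    // Diagnostic sketch: confirm the op library loaded and that 'TRTEngineOp'
    // is present in the op registry before calling LoadSavedModel.
    #include <iostream>

    #include "tensorflow/c/c_api.h"
    #include "tensorflow/core/framework/op.h"
    #include "tensorflow/core/lib/core/status.h"

    bool TrtOpRegistered(const char* trt_lib_path) {
      TF_Status* status = TF_NewStatus();
      TF_Library* lib = TF_LoadLibrary(trt_lib_path, status);
      if (TF_GetCode(status) != TF_OK) {
        std::cerr << "TF_LoadLibrary failed: " << TF_Message(status) << std::endl;
      }
      TF_DeleteStatus(status);
      (void)lib;  // handle not used further in this sketch

      // Ask the process-wide op registry whether the op definition is known.
      const tensorflow::OpRegistrationData* op_data = nullptr;
      tensorflow::Status lookup =
          tensorflow::OpRegistry::Global()->LookUp("TRTEngineOp", &op_data);
      if (!lookup.ok()) {
        std::cerr << "'TRTEngineOp' not registered: " << lookup.error_message()
                  << std::endl;
        return false;
      }
      return true;
    }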

I am able to run successfully:

  • the original model from '/path/to/frozen/SavedModel/folder' with this C++ code (a minimal Session::Run sketch is shown below)
  • the converted '/path/for/converted/SavedModel/folder' in Python

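For reference, the inference call used with the loaded bundle is the usual SavedModelBundle path. A minimal sketch, with the tensor names ("input:0"/"output:0") and the input shape as placeholders for whatever saved_model_cli reports for the model:

    // Minimal sketch of running the loaded bundle once LoadSavedModel succeeds.
    #include <vector>

    #include "tensorflow/cc/saved_model/loader.h"
    #include "tensorflow/core/framework/tensor.h"

    tensorflow::Status RunOnce(tensorflow::SavedModelBundle* bundle) {
      // Dummy input; shape and dtype must match the model's signature.
      tensorflow::Tensor input(tensorflow::DT_FLOAT,
                               tensorflow::TensorShape({1, 224, 224, 3}));
      input.flat<float>().setZero();

      std::vector<tensorflow::Tensor> outputs;
      return bundle->session->Run({{"input:0", input}},  // feeds
                                  {"output:0"},          // fetches
                                  {},                    // target nodes
                                  &outputs);
    }
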
I was able to get rid of this error using this answer, but now I get:

Invalid argument: NodeDef mentions attr 'segment_func' not in Op<name=TRTEngineOp
...
(Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).

I believe it's due to the 1.15.0 vs 1.12.0-rc2 TensorFlow version mismatch. I'm trying to build a wheel, because it looks like 1.12 isn't available from pip anymore. Will update later.
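
One cheap sanity check for this kind of mismatch is to print the version the C++ binary was built against (TF_VERSION_STRING from tensorflow/core/public/version.h) and compare it with tf.__version__ on the Python side that did the conversion. A sketch:

    // Print the TensorFlow version the C++ binary was compiled against, to be
    // compared with the Python tf.__version__ used for the TF-TRT conversion.
    #include <iostream>

    #include "tensorflow/core/public/version.h"

    int main() {
      std::cout << "TensorFlow C++ version: " << TF_VERSION_STRING << std::endl;
      return 0;
    }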

EDITED: I can confirm that everything works once the same TensorFlow version is used both to convert the model and to run inference, so I'm closing this issue.

@yurkovak
Hello, have you successfully run the TF-TRT model in C++?

CC: @meena-at-work

@YouSenRong We are currently building benchmarks in C++. Please have a look here: https://github.com/tensorflow/tensorrt/tree/master/tftrt/benchmarking-cpp