tensorflow/tensorrt

OP_REQUIRES failed at partitioned_ops, but tensorRT model can be loaded

luvwinnie opened this issue · 3 comments

I'm using the stable TensorFlow 2.0 release. I exported my model as a SavedModel and then used the following code to convert it to TensorRT.

from tensorflow.python.compiler.tensorrt import trt_convert as trt
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=trt.TrtPrecisionMode.FP16, is_dynamic_op=True)
converter = trt.TrtGraphConverterV2(input_saved_model_dir="model/01",conversion_params=params)
converter.convert()
converter.save("model_optimized/01")
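
A quick way to confirm that the saved output really contains TensorRT nodes, and so needs a serving binary with `TRTEngineOp` registered, is to look for the op name in the serialized graph. This is only a rough sketch: it relies on protobuf storing string fields as raw bytes, and the helper name `saved_model_mentions_op` is made up for illustration and is not part of TensorFlow.

```python
from pathlib import Path


def saved_model_mentions_op(saved_model_dir, op_name="TRTEngineOp"):
    """Crude check: does saved_model.pb contain the op name as raw bytes?"""
    pb_path = Path(saved_model_dir) / "saved_model.pb"
    # Protobuf string fields are stored as raw UTF-8 bytes, so an op type
    # used in the graph appears verbatim in the serialized file.
    # A match only shows the name is present, not that an engine will run.
    return op_name.encode("utf-8") in pb_path.read_bytes()
```

For the directory written above, `saved_model_mentions_op("model_optimized/01")` should come back `True` if the conversion log's "TRTEngineOp_N added" lines are accurate.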

When I loaded it into tensorflow_model_server, the model loaded without errors. However, when I try to run inference, the model server returns the following error in its response.

{ "error": "[_Derived_]{{function_node __inference_signature_wrapper_21469}} Op type not registered \'TRTEngineOp\' in binary running on PC-1. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.\n\t [[{{node PartitionedCall}}]]\n\t [[PartitionedCall]]" }

How can I solve this? I only used layers that are supported by TensorRT.
Below are my TensorRT conversion logs.

2019-10-29 02:13:09.541971: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.5
2019-10-29 02:13:09.671300: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-10-29 02:13:09.775980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:17:00.0
2019-10-29 02:13:09.778019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:65:00.0
2019-10-29 02:13:09.778044: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-10-29 02:13:09.778057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-10-29 02:13:09.778868: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-10-29 02:13:09.779103: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-10-29 02:13:09.780128: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-10-29 02:13:09.780909: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-10-29 02:13:09.780941: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-10-29 02:13:09.788168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2019-10-29 02:13:09.788377: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-10-29 02:13:09.802004: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000000000 Hz
2019-10-29 02:13:09.804680: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x70f3820 executing computations on platform Host. Devices:
2019-10-29 02:13:09.804721: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-10-29 02:13:10.071785: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7155b40 executing computations on platform CUDA. Devices:
2019-10-29 02:13:10.071838: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): TITAN RTX, Compute Capability 7.5
2019-10-29 02:13:10.071863: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (1): TITAN RTX, Compute Capability 7.5
2019-10-29 02:13:10.074425: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:17:00.0
2019-10-29 02:13:10.076628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:65:00.0
2019-10-29 02:13:10.076686: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-10-29 02:13:10.076712: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-10-29 02:13:10.076737: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-10-29 02:13:10.076761: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-10-29 02:13:10.076788: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-10-29 02:13:10.076819: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-10-29 02:13:10.076842: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-10-29 02:13:10.085324: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2019-10-29 02:13:10.085387: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-10-29 02:13:10.418393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-29 02:13:10.418424: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 1
2019-10-29 02:13:10.418429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N Y
2019-10-29 02:13:10.418432: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 1:   Y N
2019-10-29 02:13:10.421686: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22752 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:17:00.0, compute capability: 7.5)
2019-10-29 02:13:10.422939: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22752 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:65:00.0, compute capability: 7.5)
2019-10-29 02:13:14.255005: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 2
2019-10-29 02:13:14.255084: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2019-10-29 02:13:14.256101: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:17:00.0
2019-10-29 02:13:14.256908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:65:00.0
2019-10-29 02:13:14.256934: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-10-29 02:13:14.256941: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-10-29 02:13:14.256950: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-10-29 02:13:14.256957: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-10-29 02:13:14.256965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-10-29 02:13:14.256971: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-10-29 02:13:14.256978: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-10-29 02:13:14.259429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2019-10-29 02:13:14.259453: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-29 02:13:14.259458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 1
2019-10-29 02:13:14.259462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N Y
2019-10-29 02:13:14.259465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 1:   Y N
2019-10-29 02:13:14.261675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22752 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:17:00.0, compute capability: 7.5)
2019-10-29 02:13:14.262492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22752 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:65:00.0, compute capability: 7.5)
2019-10-29 02:13:14.297527: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: graph_to_optimize
2019-10-29 02:13:14.297546: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   function_optimizer: Graph size after: 699 nodes (542), 1363 edges (1206), time = 16.169ms.
2019-10-29 02:13:14.297550: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   function_optimizer: function_optimizer did nothing. time = 0.189ms.
2019-10-29 02:13:26.859686: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 2
2019-10-29 02:13:26.859925: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2019-10-29 02:13:26.860946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:17:00.0
2019-10-29 02:13:26.861732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties:
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:65:00.0
2019-10-29 02:13:26.861757: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-10-29 02:13:26.861765: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-10-29 02:13:26.861774: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-10-29 02:13:26.861781: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-10-29 02:13:26.861788: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-10-29 02:13:26.861794: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-10-29 02:13:26.861801: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-10-29 02:13:26.864224: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2019-10-29 02:13:26.864251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-29 02:13:26.864256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 1
2019-10-29 02:13:26.864260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N Y
2019-10-29 02:13:26.864263: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 1:   Y N
2019-10-29 02:13:26.866532: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22752 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:17:00.0, compute capability: 7.5)
2019-10-29 02:13:26.867357: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22752 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:65:00.0, compute capability: 7.5)
2019-10-29 02:13:34.794502: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:460] There are 25 ops of 10 different types in the graph that are not converted to TensorRT: Identity, Reshape, Pack, MatMul, Placeholder, NoOp, Conv2D, Shape, DataFormatVecPermute, StridedSlice, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2019-10-29 02:13:34.811717: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:633] Number of TensorRT candidate segments: 9
2019-10-29 02:13:34.872037: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_0 added for segment 0 consisting of 5 nodes succeeded.
2019-10-29 02:13:34.872132: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_1 added for segment 1 consisting of 8 nodes succeeded.
2019-10-29 02:13:34.872186: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_2 added for segment 2 consisting of 18 nodes succeeded.
2019-10-29 02:13:34.872252: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_3 added for segment 3 consisting of 8 nodes succeeded.
2019-10-29 02:13:34.872294: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_4 added for segment 4 consisting of 9 nodes succeeded.
2019-10-29 02:13:34.872339: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_5 added for segment 5 consisting of 8 nodes succeeded.
2019-10-29 02:13:34.872390: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_6 added for segment 6 consisting of 18 nodes succeeded.
2019-10-29 02:13:34.872450: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node StatefulPartitionedCall/model_1/TRTEngineOp_7 added for segment 7 consisting of 8 nodes succeeded.
2019-10-29 02:13:34.872535: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_8 added for segment 8 consisting of 167 nodes succeeded.
2019-10-29 02:13:35.855257: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:35.871582: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:35.884292: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:36.057519: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:36.147283: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:36.171553: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:36.186575: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:36.201797: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:36.217864: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2019-10-29 02:13:36.240138: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: tf_graph
2019-10-29 02:13:36.240181: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 454 nodes (-145), 948 edges (-290), time = 2156.7439ms.
2019-10-29 02:13:36.240195: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 459 nodes (5), 953 edges (5), time = 1059.45898ms.
2019-10-29 02:13:36.240207: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 459 nodes (0), 953 edges (0), time = 1015.68ms.
2019-10-29 02:13:36.240218: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 219 nodes (-240), 301 edges (-652), time = 2466.64893ms.
2019-10-29 02:13:36.240228: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 219 nodes (0), 301 edges (0), time = 920.531ms.
2019-10-29 02:13:36.240239: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_4_native_segment
2019-10-29 02:13:36.240250: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 13 nodes (0), 12 edges (0), time = 0.929ms.
2019-10-29 02:13:36.240263: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 13 nodes (0), 12 edges (0), time = 0.368ms.
2019-10-29 02:13:36.240276: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 13 nodes (0), 12 edges (0), time = 0.975ms.
2019-10-29 02:13:36.240288: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 13 nodes (0), 12 edges (0), time = 0.064ms.
2019-10-29 02:13:36.240300: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 13 nodes (0), 12 edges (0), time = 0.767ms.
2019-10-29 02:13:36.240312: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_2_native_segment
2019-10-29 02:13:36.240331: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 22 nodes (0), 21 edges (0), time = 2.278ms.
2019-10-29 02:13:36.240349: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 22 nodes (0), 21 edges (0), time = 1.958ms.
2019-10-29 02:13:36.240367: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 22 nodes (0), 21 edges (0), time = 2.368ms.
2019-10-29 02:13:36.240384: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 22 nodes (0), 21 edges (0), time = 0.247ms.
2019-10-29 02:13:36.240397: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 22 nodes (0), 21 edges (0), time = 2.315ms.
2019-10-29 02:13:36.240408: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_3_native_segment
2019-10-29 02:13:36.240420: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 0.704ms.
2019-10-29 02:13:36.240432: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 10 nodes (0), 9 edges (0), time = 0.23ms.
2019-10-29 02:13:36.240444: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 0.71ms.
2019-10-29 02:13:36.240456: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 10 nodes (0), 9 edges (0), time = 0.045ms.
2019-10-29 02:13:36.240468: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 0.688ms.
2019-10-29 02:13:36.240479: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: TRTEngineOp_8_native_segment
2019-10-29 02:13:36.240492: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 170 nodes (0), 175 edges (0), time = 37.975ms.
2019-10-29 02:13:36.240504: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 170 nodes (0), 175 edges (0), time = 41.919ms.
2019-10-29 02:13:36.240515: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 170 nodes (0), 175 edges (0), time = 38.449ms.
2019-10-29 02:13:36.240528: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 170 nodes (0), 175 edges (0), time = 5.369ms.
2019-10-29 02:13:36.240539: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 170 nodes (0), 175 edges (0), time = 38.153ms.
2019-10-29 02:13:36.240551: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_6_native_segment
2019-10-29 02:13:36.240563: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 22 nodes (0), 21 edges (0), time = 6.668ms.
2019-10-29 02:13:36.240575: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 22 nodes (0), 21 edges (0), time = 6.469ms.
2019-10-29 02:13:36.240587: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 22 nodes (0), 21 edges (0), time = 6.67ms.
2019-10-29 02:13:36.240599: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 22 nodes (0), 21 edges (0), time = 0.821ms.
2019-10-29 02:13:36.240610: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 22 nodes (0), 21 edges (0), time = 6.681ms.
2019-10-29 02:13:36.240622: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_7_native_segment
2019-10-29 02:13:36.240634: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.919ms.
2019-10-29 02:13:36.240646: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 10 nodes (0), 9 edges (0), time = 0.719ms.
2019-10-29 02:13:36.240658: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.906ms.
2019-10-29 02:13:36.240670: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 10 nodes (0), 9 edges (0), time = 0.105ms.
2019-10-29 02:13:36.240681: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.658ms.
2019-10-29 02:13:36.240693: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_0_native_segment
2019-10-29 02:13:36.240705: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 9 nodes (0), 8 edges (0), time = 1.766ms.
2019-10-29 02:13:36.240717: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 9 nodes (0), 8 edges (0), time = 0.565ms.
2019-10-29 02:13:36.240728: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 9 nodes (0), 8 edges (0), time = 1.791ms.
2019-10-29 02:13:36.240740: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 9 nodes (0), 8 edges (0), time = 0.084ms.
2019-10-29 02:13:36.240752: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 9 nodes (0), 8 edges (0), time = 1.903ms.
2019-10-29 02:13:36.240764: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_1_native_segment
2019-10-29 02:13:36.240776: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.883ms.
2019-10-29 02:13:36.240788: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 10 nodes (0), 9 edges (0), time = 0.661ms.
2019-10-29 02:13:36.240800: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.949ms.
2019-10-29 02:13:36.240811: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 10 nodes (0), 9 edges (0), time = 0.1ms.
2019-10-29 02:13:36.240823: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.861ms.
2019-10-29 02:13:36.240835: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: StatefulPartitionedCall/model_1/TRTEngineOp_5_native_segment
2019-10-29 02:13:36.240847: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.903ms.
2019-10-29 02:13:36.240859: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   layout: Graph size after: 10 nodes (0), 9 edges (0), time = 0.691ms.
2019-10-29 02:13:36.240871: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.934ms.
2019-10-29 02:13:36.240883: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   TensorRTOptimizer: Graph size after: 10 nodes (0), 9 edges (0), time = 0.101ms.
2019-10-29 02:13:36.240894: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718]   constant folding: Graph size after: 10 nodes (0), 9 edges (0), time = 1.896ms.
2019-10-29 02:13:47.882826: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_8)
2019-10-29 02:13:47.882958: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_1)
2019-10-29 02:13:47.883013: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_2)
2019-10-29 02:13:47.883060: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_3)
2019-10-29 02:13:47.883104: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_4)
2019-10-29 02:13:47.883146: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_5)
2019-10-29 02:13:47.883188: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_6)
2019-10-29 02:13:47.883230: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_7)
2019-10-29 02:13:47.883271: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at trt_engine_resource_ops.cc:183 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_0)
WARNING:tensorflow:From /misc/home/usr16/cheesiang_leow/.virtualenvs/tensorflow-2.0/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1781: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

@luvwinnie could you provide a complete reproduction, especially instructions on how you load the model to tensorflow_model_server and do inference?

@aaroey I used the tensorflow/serving Docker image and just loaded the model. It's really just a simple command like the Docker command below.

MODEL_PATH=[absolute path to model]
docker run -t --rm -p 8501:8501 \
    -v "$MODEL_PATH/model_optimized:/models/model_optimized" \
    -e MODEL_NAME=model_name \
    tensorflow/serving &

Then I used the following client to call the REST API.

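import json
from timeit import default_timer as timer  # assumed source of `timer`

import cv2
import numpy as np
import requests

# `paths` (image file paths) and `preprocessing` are defined elsewhere in the client.
for path in paths: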
    data = cv2.imread(path)
    data = data / 127.5 - 1
    data,data_width = preprocessing(data)
    all_data = json.dumps(
        {"signature_name": "serving_default", "instances": data[np.newaxis,...].tolist()}
    )
    headers = {"content-type": "application/json"}
    start = timer()
    json_response = requests.post(
        "http://localhost:8501/v1/models/model_optimized:predict", data=all_data, headers=headers
    )
    end = timer()
    print("total elapsed time:", end - start)
    print(json_response)
    print(json_response.text)
    predictions = json.loads(json_response.text)["predictions"]

Errors

total elapsed time: 9.912729174131528
<Response [404]>
{ "error": "[_Derived_]{{function_node __inference_signature_wrapper_21469}} Op type not registered \'TRTEngineOp\' in binary running on pc-01. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.\n\t [[{{node PartitionedCall}}]]\n\t [[PartitionedCall]]" }
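
The client above indexes `["predictions"]` on every response, so an error body like this one fails on the last line of the loop instead of reporting the server's message. As far as I can tell, `json.loads` would also reject this particular body, because `\'` is not a valid JSON escape. Here is a minimal sketch of checking the status first and pulling the op name out of the raw text; the helper names `extract_unregistered_op` and `predictions_or_error` are hypothetical, not part of any library.

```python
import json
import re


def extract_unregistered_op(body_text):
    """Return the op name from an 'Op type not registered' error, else None."""
    # Works on the raw text and tolerates the backslash-escaped quotes
    # that appear in the response shown above.
    match = re.search(r"Op type not registered \\?'([^']+?)\\?'", body_text)
    return match.group(1) if match else None


def predictions_or_error(status_code, body_text):
    """Return the predictions list, or raise with a short diagnostic."""
    if status_code != 200:
        op = extract_unregistered_op(body_text)
        hint = f" (serving binary has no registered op {op})" if op else ""
        raise RuntimeError(f"predict failed with HTTP {status_code}{hint}")
    return json.loads(body_text)["predictions"]
```

In the loop, this would be called as `predictions_or_error(json_response.status_code, json_response.text)`.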

Note: the model works properly before it is optimized with TensorRT.

@luvwinnie could you try with tensorflow/serving:latest-gpu?