tensorflow/tensorrt

No SpeedUp after TensorRT INT8


Description

I converted my model to a TensorRT engine using TF-TRT in INT8 mode. However, inference speed is the same as with FP32 and FP16. I also tried setting minimum_segment_size to 2, 3, and 5, but that did not help either.
The speed stays the same regardless of minimum_segment_size or precision mode.
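For reference, here is a minimal sketch of how the speed comparison could be timed so that one-off costs do not distort it. The inference log further down shows TensorRT engines being built during the first run, so the first calls are likely to be much slower than steady state. The helper name `benchmark` and its parameters are hypothetical, and `run_once` stands for whatever callable executes one inference (for example a wrapper around `sess.run`).

```python
import statistics
import time


def benchmark(run_once, warmup=10, iters=50):
    """Time run_once() and return latency statistics in milliseconds.

    The first `warmup` calls are executed but not timed, so lazy
    initialization (such as engine building) is kept out of the result.
    """
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "iters": len(samples),
    }


# Hypothetical usage with a TF 1.x session (sess, outputs and feed are
# placeholders, not names from the original script):
#   stats = benchmark(lambda: sess.run(outputs, feed_dict=feed))
```

Running the same helper on the FP32, FP16 and INT8 graphs with identical inputs gives numbers that can be compared directly.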

I use the following code to check how many nodes are optimized or replaced with TRT nodes:

print("graph_size(MB)(native_tf): %.1f" % (float(graph_size) / (1 << 20)))
print("graph_size(MB)(trt): %.1f" % (float(len(engine_graph.SerializeToString())) / (1 << 20)))
print("num_nodes(native_tf): %d" % num_nodes)
print("num_nodes(tftrt_total): %d" % len(engine_graph.node))
print("num_nodes(trt_only): %d" % len([1 for n in engine_graph.node if str(n.op) == 'TRTEngineOp']))

The output is:
graph_size(MB)(native_tf): 52.6
graph_size(MB)(trt): 52.8
num_nodes(native_tf): 4243
num_nodes(tftrt_total): 3429
num_nodes(trt_only): 76
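As a rough reading of these counts, the sketch below estimates how much of the graph ended up inside TensorRT engines. The function name `conversion_summary` is hypothetical. The estimate assumes every node that disappeared was absorbed into a TRTEngineOp, which is an upper bound because other Grappler passes (such as constant folding) can also remove nodes.

```python
def conversion_summary(native_nodes, total_after, trt_nodes):
    """Estimate how many native TF nodes were absorbed into TRT engines."""
    # Nodes in the converted graph that are still plain TensorFlow ops.
    remaining_tf = total_after - trt_nodes
    # Upper-bound estimate of nodes that moved into TRTEngineOp nodes.
    absorbed = native_nodes - remaining_tf
    return {
        "remaining_tf_nodes": remaining_tf,
        "absorbed_nodes": absorbed,
        "absorbed_fraction": absorbed / native_nodes,
        "avg_nodes_per_engine": absorbed / trt_nodes if trt_nodes else 0.0,
    }


# Counts taken from the output above.
summary = conversion_summary(native_nodes=4243, total_after=3429, trt_nodes=76)
for key, value in summary.items():
    print(key, value)
```

With the numbers above, 3353 nodes remain native TensorFlow ops and at most 890 nodes (about 21% of the original graph) were absorbed, averaging under 12 nodes per engine.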

The log from the convert() function is as follows:
2020-02-18 04:06:40.806897: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:633] Number of TensorRT candidate segments: 76
2020-02-18 04:06:41.444418: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_0 added for segment 0 consisting of 10 nodes succeeded.
2020-02-18 04:06:41.444539: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_1 added for segment 1 consisting of 19 nodes succeeded.
2020-02-18 04:06:41.444757: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_2 added for segment 2 consisting of 17 nodes succeeded.
2020-02-18 04:06:41.444921: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_3 added for segment 3 consisting of 17 nodes succeeded.
.....
2020-02-18 04:06:42.007572: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_24_native_segment
2020-02-18 04:06:42.007581: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 20 nodes (0), 19 edges (0), time = 0.919ms.
2020-02-18 04:06:42.007587: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: Graph size after: 20 nodes (0), 19 edges (0), time = 0.652ms.
2020-02-18 04:06:42.007593: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 20 nodes (0), 19 edges (0), time = 0.765ms.
2020-02-18 04:06:42.007599: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 20 nodes (0), 19 edges (0), time = 0.095ms.
2020-02-18 04:06:42.007605: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 20 nodes (0), 19 edges (0), time = 0.828ms.
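To see how the segments are distributed, the "added for segment" lines can be parsed from a saved copy of this log. The following is a minimal sketch. The regular expression is written against the message format shown above and may need adjusting for other TensorFlow versions, and `segment_sizes` is a hypothetical helper name. The sample text contains only the four lines that are visible in this report.

```python
import re

# Matches the convert_graph.cc message shown in the log above.
SEGMENT_RE = re.compile(
    r"TensorRT node (\S+) added for segment (\d+) "
    r"consisting of (\d+) nodes succeeded"
)


def segment_sizes(log_text):
    """Return {engine_name: node_count} for every converted segment."""
    return {m.group(1): int(m.group(3)) for m in SEGMENT_RE.finditer(log_text)}


SAMPLE_LOG = """\
TensorRT node TRTEngineOp_0 added for segment 0 consisting of 10 nodes succeeded.
TensorRT node TRTEngineOp_1 added for segment 1 consisting of 19 nodes succeeded.
TensorRT node TRTEngineOp_2 added for segment 2 consisting of 17 nodes succeeded.
TensorRT node TRTEngineOp_3 added for segment 3 consisting of 17 nodes succeeded.
"""

sizes = segment_sizes(SAMPLE_LOG)
# List the engines from smallest to largest segment.
for name, count in sorted(sizes.items(), key=lambda item: item[1]):
    print(name, count)
```

Many small segments usually mean frequent transitions between TensorFlow and TensorRT, so the full list of all 76 segments is worth inspecting.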

The log from the calibrate() function is as follows:

2020-02-18 04:06:43.967599: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e44009490
2020-02-18 04:06:43.967669: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-02-18 04:06:43.968091: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-02-18 04:06:58.017542: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-18 04:06:58.058014: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e3c008780
2020-02-18 04:06:58.079357: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e74007ed0
2020-02-18 04:06:58.152381: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e6c01f3a0
2020-02-18 04:06:58.188232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-18 04:06:58.188977: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e5006f210
2020-02-18 04:06:58.258622: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e6c0279a0
2020-02-18 04:06:58.301596: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e6c027860
2020-02-18 04:06:58.382948: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e6c03b280
2020-02-18 04:06:58.432091: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e6c03fea0
2020-02-18 04:06:58.467576: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e5007d720
....
2020-02-18 04:07:12.796185: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e3c0b45c0
2020-02-18 04:07:12.836554: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e3c0bbc70
2020-02-18 04:07:13.362552: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e3c0c3520
2020-02-18 04:07:14.132879: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e3c0c48f0
2020-02-18 04:07:14.132989: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e4c001180
2020-02-18 04:07:14.331160: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:812] Starting calibration thread on device 0, Calibration Resource @ 0x7f5e3c0e3660

The log during inference is as follows:

2020-02-18 04:10:37.066778: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for TRTEngineOp_0 input shapes: [[1,8192,3]]
2020-02-18 04:10:37.066912: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-02-18 04:10:37.067393: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-02-18 04:10:55.899338: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for fp_2/TRTEngineOp_14 input shapes: [[1,256,3]]
2020-02-18 04:10:55.916907: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for fp_1/TRTEngineOp_10 input shapes: [[1,1024,3]]
2020-02-18 04:10:55.988548: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for fp_0/TRTEngineOp_6 input shapes: [[1,8192,3]]
2020-02-18 04:10:56.048204: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-18 04:10:56.049951: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for TRTEngineOp_57 input shapes: [[1,64,8192,4]]
2020-02-18 04:10:56.470749: I tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:733] Building a new TensorRT engine for TRTEngineOp_58 input shapes: [[1,64,8192,2]]
....
....
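The "Building a new TensorRT engine" lines suggest that engines are built during inference, once per engine and input shape. A minimal sketch for listing those builds from a saved log, and for spotting engines that were built more than once, could look like this. The helper names are hypothetical, the pattern is based only on the message format shown above, and the sample text uses three lines from this report.

```python
import re
from collections import Counter

# Matches the trt_engine_op.cc message shown in the log above.
BUILD_RE = re.compile(r"Building a new TensorRT engine for (\S+) input shapes: (\S+)")


def engine_builds(log_text):
    """Return a list of (engine_name, input_shapes) for every engine build."""
    return [(m.group(1), m.group(2)) for m in BUILD_RE.finditer(log_text)]


def rebuilt_engines(builds):
    """Return {engine_name: build_count} for engines built more than once."""
    counts = Counter(name for name, _ in builds)
    return {name: n for name, n in counts.items() if n > 1}


SAMPLE_LOG = """\
Building a new TensorRT engine for TRTEngineOp_0 input shapes: [[1,8192,3]]
Building a new TensorRT engine for fp_2/TRTEngineOp_14 input shapes: [[1,256,3]]
Building a new TensorRT engine for TRTEngineOp_57 input shapes: [[1,64,8192,4]]
"""

builds = engine_builds(SAMPLE_LOG)
for name, shapes in builds:
    print(name, shapes)
print("rebuilt:", rebuilt_engines(builds))
```

If an engine name shows up repeatedly with different shapes, build time is being paid again for each new shape and should be excluded from any timing.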

Environment

TensorRT Version: 6
GPU Type: GTX 1660Ti
Nvidia Driver Version: 440.59
CUDA Version: 10.1
cuDNN Version: 7.6.3
Operating System + Version: Ubuntu 16.04
Python Version (if applicable): 3.5
TensorFlow Version (if applicable): 1.15.0