DanaHan/Yolov5-in-Deepstream-5.0

Yolov5s trt engines performance


Hi @DanaHan, I am seeing unexpected performance differences when running the Yolov5s TensorRT engines with trtexec versus DeepStream, and also between batch size (BS) 1 and BS>1. Please see below:

TensorRT: 7.2.1
DeepStream: 5.1

With trtexec and BS=1:
$ LD_PRELOAD=build/libmyplugins.so /usr/src/tensorrt/bin/trtexec --loadEngine=yolov5s<precision>.engine
Performance:

  • TRT-FP32: 179.7 qps
  • TRT-FP16: 396.9 qps
  • TRT-INT8: 469.9 qps

With DeepStream and BS=1:

  • TRT-FP32: 109.1 FPS
  • TRT-FP16: 91.6 FPS
  • TRT-INT8: 88.9 FPS
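
To make the gap concrete, here is a minimal sketch that computes the trtexec-to-DeepStream ratio per precision from the BS=1 numbers above. It assumes that at BS=1 one trtexec query corresponds to one frame, so qps and FPS can be compared directly. That is my assumption, not something I have confirmed, and the variable names are only illustrative.

```python
# Throughput figures copied from the BS=1 lists above.
trtexec_qps = {"FP32": 179.7, "FP16": 396.9, "INT8": 469.9}
deepstream_fps = {"FP32": 109.1, "FP16": 91.6, "INT8": 88.9}

# Assumption: at BS=1 one trtexec query equals one frame,
# so qps and FPS are treated as the same unit here.
ratios = {p: trtexec_qps[p] / deepstream_fps[p] for p in trtexec_qps}

for precision, ratio in ratios.items():
    print(f"{precision}: trtexec throughput is {ratio:.2f}x the DeepStream throughput")
```

The ratio grows as the precision drops: trtexec gets faster from FP32 to INT8, while the DeepStream numbers get slightly slower.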

TRT-INT8 (built with BS=8):

  • BS=1: 469.9 qps with trtexec | 88.9 FPS with DeepStream
  • BS=8: 753.3 qps with trtexec | 42.7 FPS with DeepStream

What is the formula to convert qps to FPS? What parameters do I need to set in the DeepStream config files to fix the performance issues?