NVIDIA/trt-samples-for-hackathon-cn

dense model inference using TensorRT on the A30 gives wrong results

zhaohb opened this issue · 0 comments

zhaohb commented

Environment

  • TensorRT 8.6.1

  • CUDA 12.1, cuBLAS 12.1.3.1

  • Container: tensorrt:23.07-py3

  • NVIDIA driver: 510.47.03

Model: gs_concat.onnx

Reproduction Steps

  1. On an A30, run:
polygraphy run gs_concat.onnx --onnxrt --trt --tf32 --atol 1e-4 --pool-limit workspace:10G

output:
(screenshot: Polygraphy output showing the ONNX Runtime vs. TensorRT comparison failing on the A30)
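
For reference, the same comparison can also be driven from Polygraphy's Python API. The sketch below is a minimal, unofficial equivalent of the CLI invocation above (it assumes polygraphy and tensorrt are importable, as they are in the tensorrt:23.07-py3 container):

import tensorrt as trt
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import (
    CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner,
)
from polygraphy.comparator import Comparator, CompareFunc

MODEL = "gs_concat.onnx"

# ONNX Runtime as the reference; TensorRT with TF32 enabled and a
# 10 GiB workspace pool, mirroring the CLI flags above.
build_session = SessionFromOnnx(MODEL)
build_engine = EngineFromNetwork(
    NetworkFromOnnxPath(MODEL),
    config=CreateConfig(
        tf32=True,
        memory_pool_limits={trt.MemoryPoolType.WORKSPACE: 10 << 30},
    ),
)

# Run both backends on identical randomly generated inputs and compare
# all outputs with the same absolute tolerance as the CLI run.
results = Comparator.run([OnnxrtRunner(build_session), TrtRunner(build_engine)])
passed = bool(Comparator.compare_accuracy(
    results, compare_func=CompareFunc.simple(atol=1e-4)
))
print("PASSED" if passed else "FAILED")

Comparator.run feeds both runners the same generated inputs, so any mismatch comes from the backends themselves rather than from the input data.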

Expected Behavior

  • TensorRT inference matches the ONNX Runtime output within the given tolerance (atol 1e-4)

Actual Behavior

  • There is a clear gap between the TensorRT output and the ONNX Runtime output; the TensorRT inference result is wrong.

Additional Notes

  • We also ran the same command on an A6000, and the results were correct:
polygraphy run gs_concat.onnx --onnxrt --trt --tf32 --atol 1e-4 --pool-limit workspace:10G

output:
(screenshot: Polygraphy output showing the comparison passing on the A6000)

So the issue may be hardware-related.
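
Before the hardware-level cause was confirmed, one way to localize such a mismatch is to compare every intermediate tensor instead of only the final outputs, e.g. polygraphy run gs_concat.onnx --onnxrt --trt --onnx-outputs mark all --trt-outputs mark all. Below is a rough Python-API equivalent, a sketch under the same assumptions as above (the loader names are from Polygraphy's onnx/trt backends):

from polygraphy import constants
from polygraphy.backend.onnx import BytesFromOnnx, ModifyOutputs, OnnxFromPath
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import (
    EngineFromNetwork, ModifyNetworkOutputs, NetworkFromOnnxPath, TrtRunner,
)
from polygraphy.comparator import Comparator, CompareFunc

MODEL = "gs_concat.onnx"

# Mark every tensor as an output in both backends so the comparison
# reports the first intermediate tensor where the results diverge.
onnx_all = BytesFromOnnx(
    ModifyOutputs(OnnxFromPath(MODEL), outputs=constants.MARK_ALL)
)
trt_all = ModifyNetworkOutputs(
    NetworkFromOnnxPath(MODEL), outputs=constants.MARK_ALL
)

results = Comparator.run([
    OnnxrtRunner(SessionFromOnnx(onnx_all)),
    TrtRunner(EngineFromNetwork(trt_all)),
])
Comparator.compare_accuracy(results, compare_func=CompareFunc.simple(atol=1e-4))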

The bug has been confirmed by a mentor; the internal NVIDIA bug ID is 4259240.