zhangjinsong3/SOLOv2.tensorRT

TensorRT model output mismatch with onnx model output

Roujack opened this issue · 4 comments

Hi, thanks for sharing the code. I followed your instructions and converted the PyTorch model to ONNX and then to TensorRT. Tested with deploy/images/demo.jpg, the ONNX output is almost identical to the PyTorch output, but the TensorRT output is wrong:
pytorch output: (image: demo_out_torch)

onnx output: (image: demo_out_onnxrt_solov2)

tensorrt output: (image: demo_out_trt_solov2)

I also printed the results of the ONNX and TensorRT models (mask, category, and score respectively):

onnx result: (screenshot)

tensorrt result: (screenshot)

Obviously, the scores do not match. What causes this mismatch, and how can I solve it?

My environment is the same as yours:
CUDA=10.2
cudnn=8.0.5
tensorrt==7.2.1.6
pytorch==1.4.0
torchvision==0.5.0
onnx==1.8.0
onnxruntime==1.6.0
onnx-tensorrt==7.2.1
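Rather than eyeballing screenshots, it helps to quantify the disagreement between the two runtimes element-wise. A minimal sketch, assuming you have already pulled the score tensors out of ONNX Runtime and TensorRT as NumPy arrays (`scores_onnx`, `scores_trt_ok`, and `scores_trt_bad` below are made-up placeholder data, not outputs from this model):

```python
import numpy as np

def compare_outputs(a, b, atol=1e-3, name="output"):
    """Report the worst element-wise disagreement between two tensors
    and return True iff they match within the absolute tolerance."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    assert a.shape == b.shape, f"shape mismatch: {a.shape} vs {b.shape}"
    diff = np.abs(a - b)
    print(f"{name}: max abs diff = {diff.max():.6f}, "
          f"mean abs diff = {diff.mean():.6f}, "
          f"mismatched = {(diff > atol).sum()}/{diff.size}")
    return bool(diff.max() <= atol)

# Hypothetical example: near-identical scores pass, diverging scores fail.
scores_onnx = np.array([0.91, 0.87, 0.55])
scores_trt_ok = scores_onnx + 1e-4
scores_trt_bad = np.array([0.32, 0.12, 0.05])
print(compare_outputs(scores_onnx, scores_trt_ok))   # True
print(compare_outputs(scores_onnx, scores_trt_bad))  # False
```

If the max absolute difference is large on one output (e.g. scores) but small on others, that narrows down which head of the network diverges.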

You can try converting half of the model to ONNX and TensorRT to see whether TensorRT runs that part correctly, and in this way find out which layer goes wrong!
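The "convert half the model" idea generalizes to a binary search over truncation points. A sketch of that search, assuming you can re-export the network truncated after layer `i` and test it; `runs_correctly` is a hypothetical callback you would implement with the export-and-compare machinery, and divergence is assumed to be monotone (once a layer breaks, everything after it stays broken):

```python
def first_bad_layer(num_layers, runs_correctly):
    """Binary-search for the first layer at which a model truncated there
    stops matching between ONNX Runtime and TensorRT.

    runs_correctly(i) -> True iff the model truncated after layer i
    produces matching outputs in both runtimes.  Returns the index of
    the first failing layer, or None if the full model already matches.
    """
    lo, hi = 0, num_layers - 1
    if runs_correctly(hi):  # full model is fine; nothing to hunt for
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if runs_correctly(mid):
            lo = mid + 1  # divergence starts later
        else:
            hi = mid      # divergence starts here or earlier
    return lo

# Hypothetical example: truncations through layer 6 match, layer 7 breaks.
print(first_bad_layer(12, lambda i: i < 7))  # 7
```

This needs only O(log n) conversions instead of one per layer, which matters when each TensorRT engine build takes minutes.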

I ran into the same problem as you until I converted the TensorRT engine to float32. If you need float16, see my other answer in #10.

I solved this problem by setting the FPN upsample mode to bilinear. I don't know why nearest upsampling causes such a problem.
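One plausible explanation (an assumption on my part, not confirmed in this thread): nearest-neighbor resize has several coordinate/rounding conventions, and if the exporter and TensorRT disagree on the convention, they pick entirely different source pixels, whereas bilinear interpolation only shifts values slightly. A small NumPy sketch showing two conventions (ONNX Resize calls them `nearest_mode='floor'` and `'round_prefer_ceil'`) disagreeing on the same input:

```python
import numpy as np

def resize_nearest_1d(x, scale, nearest_mode="floor"):
    """1-D nearest-neighbor upsample: map each output index to a source
    coordinate dst/scale ('asymmetric' transform), then snap it to an
    integer either by flooring or by rounding half up."""
    out_len = int(len(x) * scale)
    src_f = np.arange(out_len) / scale
    if nearest_mode == "floor":
        src = np.floor(src_f)
    else:  # 'round_prefer_ceil': round halves upward
        src = np.floor(src_f + 0.5)
    src = np.clip(src.astype(int), 0, len(x) - 1)
    return x[src]

x = np.array([0, 1, 2, 3])
print(resize_nearest_1d(x, 2, "floor"))              # [0 0 1 1 2 2 3 3]
print(resize_nearest_1d(x, 2, "round_prefer_ceil"))  # [0 1 1 2 2 3 3 3]
```

Under one convention half of the output pixels come from different source pixels than under the other, so feature maps diverge hard after every FPN upsample; bilinear blends neighbors instead of selecting one, which may be why switching the mode hides the mismatch.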

Hi @Roujack, I was able to convert to ONNX successfully, then used the `inference_on_onnxrt_trt_solov2.py` script provided in deploy for inference, and I am facing this issue:

```
/content/SOLO/mmdet/models/anchor_heads/solov2_head.py in get_seg_single(self, cate_preds, seg_preds, kernel_preds, featmap_size, img_shape, ori_shape, scale_factor, cfg, rescale, debug)
    398                 rescale=False, debug=False):
    399
--> 400         assert len(cate_preds) == len(kernel_preds)
    401
    402         # overall info.

AssertionError
```

Could you please let me know how you ran inference on the ONNX model? Thanks!