DerryHub/BEVFormer_tensorrt

Has anyone tried to deploy this great repo on Jetson Orin? Unsupported operator Gridsampler2DTRT

sainttelant opened this issue · 10 comments

First of all, thanks for this great job. I've deployed this project on an x86 Ubuntu system successfully; however, I ran into lots of problems when I tried to deploy it on a Jetson Orin. For example, running test_trt_ops.py reported Unsupported operator Gridsampler2DTRT, even though I installed onnx 1.12.0 and torch 1.12.0 on the Jetson. I noticed that someone encountered a similar problem before, and it was solved by installing onnx 1.12 and torch 1.12 together, but the difference is that I installed TensorRT 8.5.2.2 on the Orin!!

I don't know whether the difference in TensorRT versions could cause this issue. By the way, I also ran into a nan_to_num plugin issue. Thanks a lot if you can help me.

It looks like the custom plugin dynamic link library is not loaded properly. Are you sure you loaded BEVFormer_tensorrt/TensorRT/build/libtensorrt_ops.so?
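For later readers, a minimal sketch (not from the repo's docs) of how to load the library from Python and confirm the plugins actually registered; the path matches the repo's build output, so adjust it if yours differs:

```python
# Minimal sketch: load the custom-op library and list the registered
# plugin creators, assuming the .so path from the repo's build step.
import ctypes
import tensorrt as trt

ctypes.CDLL("BEVFormer_tensorrt/TensorRT/build/libtensorrt_ops.so")

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

# Every registered plugin creator is listed here; Gridsampler2DTRT should
# appear if the library loaded correctly.
for creator in trt.get_plugin_registry().plugin_creator_list:
    print(creator.name, creator.plugin_version)
```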

@DerryHub thanks for your reply. I'm sure I've already loaded *ops.so successfully, judging by the logger's output. By the way, I've already addressed the issue: I substituted np.nan_to_num for the original torch nan_to_num, and after that the model converted from ONNX to TRT successfully. Thanks a lot anyway.
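For later readers, a hedged sketch of the substitution described above; the actual call site in det2trt differs, and `x` is just a placeholder tensor here:

```python
# Sketch of the workaround described above; `x` is a placeholder, not the
# actual variable name in det2trt.
import numpy as np
import torch

x = torch.tensor([float("nan"), float("inf"), 1.0])

# Original op, which exports to the unsupported nan_to_num plugin:
y = torch.nan_to_num(x)

# Workaround: detour through NumPy so the op never reaches the ONNX graph.
y = torch.from_numpy(np.nan_to_num(x.detach().cpu().numpy()))
```

Note that during torch.onnx.export tracing, a NumPy detour like this is evaluated eagerly and baked into the graph as a constant, which is a plausible cause of the accuracy drop discussed later in this thread.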

> nan_to_num

Hello, which side did you modify to solve the problem?
I tried modifying the encoder of det2trt, and the TRT engine can be converted successfully, but the evaluation metrics come out wrong.

@Alex-fishred it's been a long time since I hit that issue and I don't remember exactly where I modified the code, but I have two options for you:
1. Write the declaration of the nan_to_num plugin in the .hpp, write its definition in the .cpp, and register it in the .so properly (a Python-side export sketch follows below).
2. Substitute the NumPy version of nan_to_num for the torch version. But this will decrease inference accuracy, and you'll have to figure out a way to solve that.
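For anyone attempting option 1, here is a hedged sketch of the export half: mapping aten::nan_to_num to a custom ONNX node so the TensorRT parser can hand it to a plugin. The name NanToNumTRT is made up for illustration, not the repo's actual identifier, and the matching plugin creator still has to be implemented and registered in the C++ .so:

```python
# Illustrative only: emit a custom ONNX node for aten::nan_to_num so a
# TensorRT plugin registered under the same name can resolve it.
# "custom::NanToNumTRT" is a made-up name for this sketch.
import torch
from torch.onnx import register_custom_op_symbolic

def nan_to_num_symbolic(g, self, nan, posinf, neginf):
    # Drop the optional nan/posinf/neginf arguments for simplicity.
    return g.op("custom::NanToNumTRT", self)

register_custom_op_symbolic("aten::nan_to_num", nan_to_num_symbolic, opset_version=13)
```

When exporting, you may also need to pass custom_opsets={"custom": 1} to torch.onnx.export so the custom domain is declared in the model.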

Thank you very much for your reply. I tried np's nan_to_num, and inference does succeed, but the accuracy comes out as 0. Do you have any suggestions on how to recover the accuracy? I probably won't implement option 1 because I don't know C++ very well.

I also have no idea how to increase the accuracy, sorry about that. It would probably help to do post-NMS, or to filter results by tuning the confidence threshold... Have you made any progress on this issue?
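For later readers, a toy sketch of the confidence-threshold filtering suggested above; the arrays and the 0.3 threshold are placeholders to tune, not values from this repo:

```python
# Toy example of post-filtering detections by confidence score.
import numpy as np

def filter_by_confidence(boxes: np.ndarray, scores: np.ndarray, thr: float = 0.3):
    """Keep only detections whose confidence is at least `thr`."""
    keep = scores >= thr
    return boxes[keep], scores[keep]

boxes = np.random.rand(100, 7)   # e.g. 3D boxes (x, y, z, w, l, h, yaw)
scores = np.random.rand(100)
boxes, scores = filter_by_confidence(boxes, scores)
```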

Hello @sainttelant, what do you mean by installing onnx 1.12 and torch 1.12 synchronously? I have the same unsupported-operator problem, even though I have built the custom plugins and mmdeploy.

@vietnguyen012 you have to install the corresponding versions, onnx 1.12 and torch 1.12, on your Jetson device; otherwise you will run into many incompatible ops.
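A quick sanity check before exporting or building engines; it just prints the installed versions on the Jetson:

```python
# Verify the version pairing mentioned above on the Jetson.
import onnx
import torch
import tensorrt

print("onnx:", onnx.__version__)          # expected: 1.12.x
print("torch:", torch.__version__)        # expected: 1.12.x
print("tensorrt:", tensorrt.__version__)  # e.g. 8.5.2.2 on this Orin
```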

@sainttelant have you converted successfully with custom plugins? Can you share your TensorRT, ONNX, and torch versions on Jetson Orin, and the Orin version too? :) Thank you very much!

@vietnguyen012 actually, I've written my own custom plugins, and I'm sure I registered them; the model converted successfully. However, TensorRT still couldn't find my plugins when I executed the functions, which is weird. In the end, I replaced an op called nan_to_num with the NumPy function, and inference finally ran, although the accuracy decreased too much. I'll share my Docker image here; you can refer to it: https://hub.docker.com/repository/docker/sainttelant/bevformer_xw_tensorrt/general