[Question] Taking a long time to initiate multiple ONNX sessions using the same ONNX model, with TensorrtExecutionProvider
dinushazoomi opened this issue · 0 comments
dinushazoomi commented
Description
It takes a long time (~30 minutes) to initiate multiple ONNX sessions using the same ONNX model, but this does not happen when using different models. Any idea why this happens?
The reason I am doing this is that ONNX inference is very fast with the TensorrtExecutionProvider.
Steps to reproduce
Here is my code:
import onnxruntime as ort
import onnxruntime as ort
providers = ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
ort_sess_1 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
ort_sess_2 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
ort_sess_3 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
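Not part of the original report, but a common cause of long start-up with the TensorrtExecutionProvider is that every InferenceSession rebuilds the TensorRT engine from the ONNX model. A sketch of enabling the TensorRT EP's engine cache so subsequent sessions can reuse a serialized engine; the option names come from ONNX Runtime's TensorRT EP options, and the cache directory is a placeholder you would adjust:

```python
# Sketch: configure the TensorRT EP with engine caching enabled.
# "./trt_cache" is an assumed, writable directory of your choice.
trt_options = {
    "trt_engine_cache_enable": True,     # reuse serialized engines across sessions
    "trt_engine_cache_path": "./trt_cache",
    "trt_fp16_enable": True,             # matches the fp16 model in the report
}

providers = [
    ("TensorrtExecutionProvider", trt_options),
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

def make_session(model_path, providers=providers):
    # Imported here so the provider configuration above can be inspected
    # even on a machine without onnxruntime installed.
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=providers)
```

With caching on, the first session still pays the engine-build cost, but later sessions created via make_session should load the cached engine instead of rebuilding it.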
Environment
I am running this on Jetson Orin
TensorRT Version: 8.5.2.2
ONNX Runtime Version: 1.12.1
A side question: does ONNX Runtime use trtexec to serialize a TensorRT engine file and then use that engine in the ONNX Runtime session?
Any help on this matter would be highly appreciated.
Thanks