onnx/onnx-tensorrt

[Question] Taking a long time to initiate multiple ONNX sessions using the same ONNX model, with TensorrtExecutionProvider

dinushazoomi opened this issue · 0 comments

Description

It takes a long time (~30 minutes) to create multiple ONNX sessions from the same ONNX model, but this does not happen when the sessions use different models. Any idea why this happens?

The reason I am using the TensorrtExecutionProvider is that ONNX inference is very fast with it.

Steps to reproduce

Here is my code:

import onnxruntime as ort

providers = ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']

ort_sess_1 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
ort_sess_2 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
ort_sess_3 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
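
For what it's worth, here is a timed variant I am considering, to check whether the time is going into rebuilding the TensorRT engine for each session. The trt_engine_cache_enable / trt_engine_cache_path options are my reading of the ONNX Runtime TensorRT EP documentation, and the ./trt_cache directory is an arbitrary placeholder; I have not verified either on the Orin yet.

import time
import onnxruntime as ort

# Engine-cache options as I understand them from the ONNX Runtime
# TensorRT EP docs (assumed, not yet verified on the Orin);
# the cache directory is an arbitrary placeholder.
providers = [
    ('TensorrtExecutionProvider', {
        'trt_engine_cache_enable': True,
        'trt_engine_cache_path': './trt_cache',
    }),
    'CUDAExecutionProvider',
    'CPUExecutionProvider',
]

model_path = "./model_data/mobilenetv2_x1_0-fp16.onnx"

# Time each session creation to see whether only the first one is slow.
for i in range(3):
    start = time.time()
    sess = ort.InferenceSession(model_path, providers=providers)
    print(f"session {i + 1} created in {time.time() - start:.1f} s")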

Environment

I am running this on a Jetson Orin.

TensorRT Version: 8.5.2.2
ONNX Runtime Version: 1.12.1

Another side question: does ONNX Runtime use trtexec to serialize a TensorRT engine file and then use that engine in the ONNX Runtime session?
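
(To clarify what I mean: trtexec can build and serialize an engine offline with a command like the one below; the flags are from the TensorRT documentation and the output file name is just a placeholder. I am wondering whether ONNX Runtime does something equivalent internally.)

trtexec --onnx=./model_data/mobilenetv2_x1_0-fp16.onnx --saveEngine=./model_data/mobilenetv2.engine --fp16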

Any help on this matter would be highly appreciated.
Thanks