microsoft/onnxconverter-common

FP16 model cannot get acceleration on GPU with ONNXRuntime-GPU

yeliang2258 opened this issue · 2 comments

Hello,
I used the float16 tool to convert FP32 models to FP16 and ran inference with ONNXRuntime-GPU 1.13.1.
I found that many of the converted models do not get any inference speedup.
What kinds of ONNX FP32 models can be expected to run faster after conversion to FP16?
Looking forward to your answer, thank you

@abock Hi, please take a look at my issue, thank you

Hi @yeliang2258 ,

We’ve gone ahead and closed this issue because it is stale.
If you still believe this issue is relevant, please feel free to reopen the issue and we will triage it as necessary.

Apologies for not addressing it at the time it was opened.