how to send input request to the loaded bert_pt model
Opened this issue · 0 comments
VJAYSLN commented
Hi @rkoystart ,
I read your issue on the triton-inference-server repo. There you mentioned:
"@CoderHam It is running properly in the case of a single GPU. The above-mentioned error occurs when I run it on multiple GPUs."
I face the same issue on CPU (not GPUs). Did you fix it?
Do you have any idea how to resolve this on CPU?
Many thanks in advance :)
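For anyone landing here with the same question from the issue title: one way to send an input request to a loaded model like `bert_pt` is Triton's HTTP/REST v2 inference endpoint (`POST /v2/models/<model>/infer`). Below is a minimal sketch of building that request body with only the standard library. The tensor names (`input_ids`, `attention_mask`, `output`), datatypes, and sequence length are assumptions for illustration — the real names and shapes must match the model's `config.pbtxt` (or the response from `GET /v2/models/bert_pt`).

```python
import json

# Hypothetical tensor names/shapes -- check the model's config.pbtxt
# for the actual values; these are placeholders for a BERT-style model.
seq_len = 8
request_body = {
    "inputs": [
        {
            "name": "input_ids",       # assumed input name
            "shape": [1, seq_len],
            "datatype": "INT64",
            # Example token ids ([CLS] ... [SEP], zero-padded)
            "data": [101, 7592, 2088, 102, 0, 0, 0, 0],
        },
        {
            "name": "attention_mask",  # assumed input name
            "shape": [1, seq_len],
            "datatype": "INT64",
            "data": [1, 1, 1, 1, 0, 0, 0, 0],
        },
    ],
    "outputs": [{"name": "output"}],   # assumed output name
}

payload = json.dumps(request_body)
# With the server running, POST `payload` to:
#   http://localhost:8000/v2/models/bert_pt/infer
# using urllib.request, requests, or the tritonclient.http client.
```

The same request can be made with the official `tritonclient` Python package (`InferInput`/`InferRequestedOutput`), which handles serialization for you; the JSON form above is just the underlying protocol.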