awslabs/multi-model-server

[Q] GPU support

oonisim opened this issue · 3 comments

The AWS documentation (https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html) states: "Multi-model endpoints are not supported on GPU instance types."

Could you kindly explain whether this is technically impossible or simply not yet implemented?
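For context, the restriction applies at endpoint-configuration time, where the instance type is chosen. A minimal boto3 sketch of how a multi-model endpoint is set up (the model name, role ARN, container image, and S3 prefix below are placeholders, not values from this issue):

```python
import boto3

sm = boto3.client("sagemaker")

# A multi-model endpoint is declared via Mode="MultiModel" on the container;
# ModelDataUrl points at an S3 prefix holding many model artifacts.
sm.create_model(
    ModelName="my-multi-model",  # placeholder
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    PrimaryContainer={
        "Image": "<account>.dkr.ecr.<region>.amazonaws.com/my-inference-image",  # placeholder
        "Mode": "MultiModel",
        "ModelDataUrl": "s3://my-bucket/models/",  # placeholder prefix
    },
)

# The documented restriction bites here: InstanceType must be a CPU type
# (e.g. ml.m5.xlarge); a GPU type such as ml.p3.2xlarge would be rejected.
sm.create_endpoint_config(
    EndpointConfigName="my-multi-model-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-multi-model",
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
    }],
)
```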

Hi @oonisim

Do you know how we can get inferences from a multi-model endpoint for models that require GPU memory?

Thanks

Hi @Vinayaks117, per the AWS documentation (https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html), "Multi-model endpoints are not supported on GPU instance types." I am not sure whether you can run Multi Model Server itself on GPU instances (see this repository for the implementation; I believe the behavior is framework-dependent, e.g. PyTorch vs. TensorFlow). Please open a case with AWS Support for a definitive answer; I am afraid that is the only way.
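For what it's worth, on the supported CPU instance types you select which model serves a request with the TargetModel parameter of InvokeEndpoint. A minimal sketch (the endpoint and artifact names are placeholders):

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# TargetModel names an artifact under the endpoint's S3 prefix; SageMaker
# lazily loads it into the instance's (CPU) memory on first use.
response = runtime.invoke_endpoint(
    EndpointName="my-multi-model-endpoint",  # placeholder
    TargetModel="model-a.tar.gz",            # placeholder artifact name
    ContentType="application/json",
    Body=b'{"instances": [[1.0, 2.0, 3.0]]}',
)
print(response["Body"].read())
```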