aws/sagemaker-huggingface-inference-toolkit

Support for Multi-Model Endpoints using the HuggingFace Inference Toolkit

Tarun02 opened this issue · 6 comments

Hi,

Does the SageMaker HuggingFace Inference Toolkit support multi-model endpoints?

Yes, the HuggingFace Inference Toolkit uses a mechanism similar to the other toolkits. You can find documentation here:
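For context on that mechanism: with a multi-model endpoint, all model archives sit under one S3 prefix, and the caller selects which model serves each request via the `TargetModel` parameter of `invoke_endpoint` (the model is loaded on first use and cached). A minimal sketch of how such a request is composed; the endpoint name, archive name, and payload are made up for illustration:

```python
import json

def build_invoke_args(endpoint_name: str, target_model: str, inputs: dict) -> dict:
    """Compose kwargs for boto3's sagemaker-runtime invoke_endpoint call.

    TargetModel names the model.tar.gz under the endpoint's S3 prefix
    that should handle this request.
    """
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "TargetModel": target_model,  # e.g. "distilbert.tar.gz" (illustrative)
        "Body": json.dumps(inputs),
    }

# Hypothetical usage against a live endpoint:
# runtime = boto3.client("sagemaker-runtime")
# runtime.invoke_endpoint(**build_invoke_args(
#     "hf-mme", "distilbert.tar.gz", {"inputs": "I love this."}))
args = build_invoke_args("hf-mme", "distilbert.tar.gz", {"inputs": "I love this."})
print(args["TargetModel"])
```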

Hi @philschmid,

Thanks for the resources! In this post, https://www.philschmid.de/sagemaker-huggingface-multi-container-endpoint, you mention that multi-container endpoints are not possible on GPU.

Is this true for multi-model endpoints as well?

Yes, currently Multi-Model Endpoints also work only on CPU instances.
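A sketch of what a CPU-only MME deployment looks like with the SageMaker Python SDK, with a small runnable guard against GPU instance types; the model name, S3 prefix, and the exact family prefixes checked are assumptions for illustration:

```python
# Common GPU-backed families (assumption: p* and g* cover the usual cases,
# e.g. ml.p3.2xlarge, ml.g4dn.xlarge).
GPU_FAMILIES = ("ml.p", "ml.g")

def assert_cpu_instance(instance_type: str) -> str:
    """Raise if the instance type is GPU-backed, since multi-model
    endpoints (at the time of this thread) run only on CPU instances."""
    if instance_type.startswith(GPU_FAMILIES):
        raise ValueError(f"Multi-model endpoints do not support {instance_type}")
    return instance_type

# Hypothetical deploy flow (names and prefix are illustrative):
#
# from sagemaker.multidatamodel import MultiDataModel
# mme = MultiDataModel(
#     name="hf-mme",
#     model_data_prefix="s3://my-bucket/hf-models/",  # one prefix, many model.tar.gz
#     model=huggingface_model,
# )
# predictor = mme.deploy(
#     initial_instance_count=1,
#     instance_type=assert_cpu_instance("ml.c5.xlarge"),
# )
print(assert_cpu_instance("ml.c5.xlarge"))
```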

Thanks @philschmid, you saved me some time.

Hello!

I'm interested in helping out with this. What's the main blocker to implementing support for GPU MME?

Why wouldn't it suffice to just add the multi-models tag?
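For context on the "multi-models tag": SageMaker detects a container's MME capability via Docker labels, but the container must also implement the dynamic per-model load/unload contract (as the multi-model server does), so the label alone does not gate GPU support. A sketch of the relevant Dockerfile lines, assuming a custom serving container:

```dockerfile
# Advertise multi-model capability to SageMaker; the serving stack
# inside the image must also implement the per-model load/unload API.
LABEL com.amazonaws.sagemaker.capabilities.multi-models=true
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
```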

Seconding this question. It has been 1.5 years since the original question; is GPU still not supported?