aws/sagemaker-huggingface-inference-toolkit

Error on SageMaker deployment for v1.0.1

mer0mingian opened this issue · 1 comment

RoPE is very useful for me, so I naively tried to use the Docker image ghcr.io/huggingface/text-generation-inference:1.0.1 as a custom image to deploy meta-llama/Llama-2-13b-chat-hf, as in this notebook: I pushed the image to a private ECR repository and used its URI for the HuggingFaceModel.
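For context, a minimal sketch of that deployment with the SageMaker Python SDK; the ECR URI, instance type, token, and environment variable values are placeholders/assumptions, not taken from my actual setup:

```python
# Sketch of deploying the custom TGI image pushed to a private ECR repository.
# All angle-bracketed values are placeholders.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # or an explicit IAM role ARN

model = HuggingFaceModel(
    role=role,
    # URI of the TGI 1.0.1 image after pushing it to private ECR
    image_uri="<account-id>.dkr.ecr.<region>.amazonaws.com/text-generation-inference:1.0.1",
    env={
        "HF_MODEL_ID": "meta-llama/Llama-2-13b-chat-hf",
        "HUGGING_FACE_HUB_TOKEN": "<token>",  # Llama 2 is gated on the Hub
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # placeholder instance type
)
```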

CloudWatch lists this error:

error: unexpected argument 'serve' found

My assumption is that something is passed as a generic argument during deployment and ends up at the wrong entrypoint. I also cannot find the (sagemaker-)entrypoint.sh when attaching to the container, so maybe that is the problem. Am I missing an additional step to configure the image for use with SageMaker?
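A sketch of how I understand the failure mode, assuming Docker is available locally: SageMaker hosting starts inference containers as `docker run <image> serve`, and the stock TGI image's entrypoint (text-generation-launcher) has no `serve` subcommand, so running the image the same way should reproduce the error outside SageMaker.

```python
# Local reproduction of the hypothesized failure: pass "serve" to the TGI
# image the way SageMaker hosting does. Expected to fail with
# "error: unexpected argument 'serve' found".
import subprocess

IMAGE = "ghcr.io/huggingface/text-generation-inference:1.0.1"

result = subprocess.run(
    ["docker", "run", "--rm", IMAGE, "serve"],
    capture_output=True,
    text=True,
)
print(result.returncode)
print(result.stderr)
```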

Sorry, wrong repo for reporting