SageMaker deployment errors
jonrossclaytor opened this issue · 2 comments
Background
We are attempting to deploy SageMaker Endpoints using the code provided under Deploy - Amazon SageMaker from huggingface.co for these two models:
https://huggingface.co/Salesforce/codegen25-7b-multi
https://huggingface.co/openchat/opencoderplus
Error
Both endpoints consistently fail to deploy: each one fails its health checks. Error logs are available on request, as it does not appear that I can attach them here.
@philschmid is there any guidance you can provide on these errors?
Currently, all models with sharded checkpoints, such as these two, fail to deploy. This library filters out files that don't match a predefined allowlist, and the sharded checkpoint format isn't on that list.
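To illustrate the kind of filtering described above, here is a minimal sketch in Python. The patterns and the `filter_files` helper are hypothetical and chosen only for illustration; they are not the toolkit's actual allowlist or code. The file names follow the usual sharded-checkpoint naming used by transformers.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist for illustration only; NOT the toolkit's real list.
ALLOW_PATTERNS = ["config.json", "tokenizer*.json", "pytorch_model.bin", "*.txt"]


def filter_files(filenames, allow_patterns=ALLOW_PATTERNS):
    """Keep only the file names that match at least one allowlist pattern."""
    return [
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in allow_patterns)
    ]


# A sharded checkpoint splits the weights across several files plus an index.
repo_files = [
    "config.json",
    "tokenizer_config.json",
    "pytorch_model-00001-of-00002.bin",
    "pytorch_model-00002-of-00002.bin",
    "pytorch_model.bin.index.json",
]

kept = filter_files(repo_files)
dropped = [name for name in repo_files if name not in kept]

print("kept:   ", kept)
print("dropped:", dropped)

# Adding patterns for the shard files and the index keeps everything.
widened = ALLOW_PATTERNS + ["pytorch_model-*.bin", "*.index.json"]
print("kept with widened allowlist:", filter_files(repo_files, widened))
```

With an allowlist like this one, every weight shard and the shard index are dropped, so the model ends up with no weights to load. That would be consistent with the failing health checks, although the logs would be needed to confirm it.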
I've opened PR #93 to fix this. Until it gets merged, you might be able to get by with a custom Docker image built from my fork, like so:
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:2.0.0-transformers4.28.1-gpu-py310-cu118-ubuntu20.04
RUN pip install --no-cache-dir \
    git+https://github.com/JimAllanson/sagemaker-huggingface-inference-toolkit@sharded-checkpoint-support