aws/sagemaker-huggingface-inference-toolkit

Using custom inference script and models from Hub

Tarun02 opened this issue · 1 comment

Hello,

Can we use a custom inference script without downloading the model to S3? According to the link, we need to download the model artifacts and push them to S3 before using a custom inference script.

This will add considerable overhead depending on the model size.

+1. If I have to download the weights and repackage them, it kind of defeats the purpose of this library. The embedding models' "feature-extraction" task returns the last hidden state, which is missing a step (pooling). The only thing I can do is repackage everything; I can't even add a "code/inference.py" file to the Hugging Face repository, because FILE_LIST_NAMES prevents it from being downloaded.
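For context on the "missing a step" above: to turn the last hidden state into a single sentence embedding, the usual post-processing is mean pooling over the token dimension, weighted by the attention mask so padding tokens are ignored. A minimal sketch of that step (using NumPy arrays for illustration; `mean_pool` is a hypothetical helper, not part of the toolkit):

```python
import numpy as np

def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings across the sequence, ignoring padding.

    last_hidden_state: (batch, seq_len, hidden) array of token embeddings
    attention_mask:    (batch, seq_len) array of 1s (real tokens) and 0s (padding)
    """
    # Expand the mask to the hidden dimension so it broadcasts over embeddings
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)
    # Zero out padding positions, then sum over the sequence axis
    summed = (last_hidden_state * mask).sum(axis=1)
    # Count real tokens per example; clip to avoid division by zero
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts
```

In a custom `inference.py`, this kind of pooling would typically run in a custom `predict_fn` on the model's raw output before returning embeddings to the caller.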