aws/sagemaker-huggingface-inference-toolkit

Where is the logic for detecting custom inference.py?

BaiqingL opened this issue · 6 comments

I am trying to deploy a model with custom inference code in the code folder. Currently SageMaker complains and wants me to provide a HF task, but I am not seeing where SageMaker loads inference.py.

https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/44e3decd8aab4a710ef5f1094c39818cf7ea0f28/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L108C14-L108C14

I am raising this question because I have a custom inference.py with model_fn and transform_fn, but I still see the error that I must define a HF pipeline task.
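
For context, I am deploying roughly like this (a simplified sketch: the S3 path, role, and container versions are placeholders, and no HF_TASK is set because I expect my custom handlers to be used):

    from sagemaker.huggingface import HuggingFaceModel

    # model_data points at the model.tar.gz described below (placeholder path).
    huggingface_model = HuggingFaceModel(
        model_data="s3://my-bucket/model.tar.gz",  # placeholder
        role="my-sagemaker-execution-role",        # placeholder
        transformers_version="4.26",
        pytorch_version="1.13",
        py_version="py39",
    )

    predictor = huggingface_model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.2xlarge",
    )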

My model.tar.gz anatomy is:

C:.
│   added_tokens.json
│   config.json
│   generation_config.json
│   model-00001-of-00002.safetensors
│   model-00002-of-00002.safetensors
│   model.safetensors.index.json
│   special_tokens_map.json
│   tokenizer.json
│   tokenizer.model
│   tokenizer_config.json
│
└───code
        inference.py
        requirements.txt
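
The inference.py in code/ implements the two handlers along these lines (a simplified sketch; the actual generation logic is omitted and the payload format is illustrative):

    import json

    from transformers import AutoModelForCausalLM, AutoTokenizer


    def model_fn(model_dir):
        # Called once at startup; model_dir is the root of the extracted
        # model.tar.gz, where the safetensors shards and tokenizer live.
        tokenizer = AutoTokenizer.from_pretrained(model_dir)
        model = AutoModelForCausalLM.from_pretrained(model_dir)
        return model, tokenizer


    def transform_fn(model_and_tokenizer, input_data, content_type, accept):
        # Handles deserialization, prediction, and serialization in one step.
        model, tokenizer = model_and_tokenizer
        payload = json.loads(input_data)
        inputs = tokenizer(payload["inputs"], return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=64)
        text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return json.dumps({"generated_text": text})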


That error means your folder structure is wrong. See the documentation for how to create it: https://huggingface.co/docs/sagemaker/inference#create-a-model-artifact-for-deployment

Do you mean that safetensors are not supported?

No. Your model.tar.gz is wrong. Follow the steps in the documentation, starting with step 2 if you already have your model locally.
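
Concretely, the archive needs to be created from inside the model directory, so that config.json and the code/ folder end up at the root of the tarball; creating it from the parent directory wraps everything in an extra top-level folder. Roughly (directory name is a placeholder):

    cd my-model-dir        # the folder that directly contains config.json and code/
    tar zcvf model.tar.gz *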

Oh! Wrong directory, thanks.