aws/sagemaker-huggingface-inference-toolkit

[DOCS] List of available HF_TASK and default inference scripts

austinmw opened this issue · 4 comments

Hi, if for example I deploy the following:

from sagemaker.huggingface import HuggingFaceModel

# Hub Model configuration. https://huggingface.co/models
hub = {
  'HF_MODEL_ID':'distilbert-base-uncased-distilled-squad', # model_id from hf.co/models
  'HF_TASK':'question-answering' # NLP task you want to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,
   role=role, # iam role with permissions to create an Endpoint
   transformers_version="4.26", # transformers version used
   pytorch_version="1.13", # pytorch version used
   py_version="py39", # python version of the DLC
)
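
For completeness, I then deploy and query it roughly like this (the instance type is just an example choice on my part):

# deploy the model to a real-time endpoint
predictor = huggingface_model.deploy(
   initial_instance_count=1,      # number of instances
   instance_type="ml.m5.xlarge"   # example instance type
)

# send a question-answering request
predictor.predict({
   "inputs": {
       "question": "What is used as the backbone?",
       "context": "DistilBERT is used as the backbone of this QA model."
   }
})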

Where can I find the list of available HF_TASK, and the default inference script for each task (if I don't provide a custom one)?


I ask because I'd like to deploy a text embedding model, which I believe falls under the feature-extraction task, but I'm not quite sure, and I also don't know what the default inference script for that task looks like.
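
For illustration, here's roughly what I'd expect that config to look like, assuming feature-extraction is the right task name (the model id is just a placeholder I'd try):

# hypothetical hub config -- assuming 'feature-extraction' is the right task
hub = {
  'HF_MODEL_ID':'sentence-transformers/all-MiniLM-L6-v2', # placeholder embedding model
  'HF_TASK':'feature-extraction'                          # my guess at the task name
}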

For more context, I've deployed the hkunlp/instructor-large model with a custom inference script, but am wondering if that is needed or if I could have deployed it without a custom script.

The snippet always uses the task the model was fine-tuned on. The distilbert-base-uncased-distilled-squad model cannot be used for any other task.
To run instructor models, as in your example, you need their PyPI package.
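
With a custom script, that dependency can be declared in a requirements.txt inside the code/ directory of the model archive, which the container installs at startup. As a rough sketch:

model.tar.gz
├── config.json
├── pytorch_model.bin
├── ...
└── code/
    ├── inference.py       # custom model_fn / predict_fn
    └── requirements.txt   # contains: InstructorEmbedding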

Do I need custom inference code to run the instructor model? I got it to work using the code below, in the style of your sagemaker/17_custom_inference_script/sagemaker-notebook.ipynb example:

from InstructorEmbedding import INSTRUCTOR
import torch


def model_fn(model_dir):
    # Load the INSTRUCTOR model from the unpacked model archive
    model = INSTRUCTOR(model_name_or_path=model_dir)
    return model


def predict_fn(data, model):
    # Get instruction, falling back to a generic retrieval instruction
    instruction = data.pop("instruction", "Represent the query for retrieval: ")

    # Get sentences to embed
    sentences = data.pop("sentences")

    # Pair the instruction with each input sentence and encode
    with torch.no_grad():
        embeddings = model.encode([[instruction, sentence] for sentence in sentences])

    # Return a dictionary, which will be JSON serializable
    return {"vectors": embeddings.tolist()}

However I'm curious if I needed to do that, or if there's a way to deploy it without custom code?

The INSTRUCTOR models currently need the INSTRUCTOR package, so sadly, no.

Thanks