aws/sagemaker-huggingface-inference-toolkit

Adjust request body data structure for numpy.ndarray for pipeline.

wildgeece96 opened this issue · 7 comments

It's related to huggingface/transformers#19743.

I think converting logic should be implemented.

In this toolkit's implementation, the request body must be a JSON dict, but many pipelines do not accept a list and require an np.ndarray instead. We cannot achieve that with SageMaker's serializers.

Hello @wildgeece96,

The Hugging Face inference toolkit supports all the transformers pipelines with their default inputs. The Toolkit implements several serializers to parse binary data, e.g., audio or images to the matching format for the transformers pipeline, e.g., PIL or np.

For everything else, you can go the default SageMaker way by implementing a model_fn, input_fn ...
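As a rough illustration of the input_fn route, here is a minimal sketch of a hook that decodes a JSON body and converts a list under "inputs" into an np.ndarray. It assumes the hook receives the raw request body and its content type, as in the usual SageMaker convention; the exact signature the toolkit expects, the "inputs" key, and the float32 dtype are assumptions to check against the toolkit's documentation and your model.

```python
import json

import numpy as np


def input_fn(input_data, content_type):
    """Decode a JSON request body and turn a list payload into an np.ndarray.

    Assumed signature (raw body, content type); verify against the toolkit docs.
    """
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")

    # The body may arrive as bytes or as str depending on the caller.
    if isinstance(input_data, (bytes, bytearray)):
        input_data = input_data.decode("utf-8")

    data = json.loads(input_data)

    # JSON has no array type, so numeric data arrives as nested Python lists.
    if isinstance(data.get("inputs"), list):
        data["inputs"] = np.asarray(data["inputs"], dtype=np.float32)  # dtype is an assumption
    return data
```

A hook like this would typically live in a user-provided inference script next to the model artifacts, so that the conversion happens before the pipeline is called.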

Thanks, @philschmid.

Oh, I didn't know that. Thanks.
Anyway, I am trying to use AutomaticSpeechRecognitionPipeline, which takes its inputs as a dict containing a numpy.ndarray, like {"sampling_rate": int, "raw": np.array}. Is there any way to handle that case?

I also have a question: how can I use these encoders and decoders?

I want to pass an np.ndarray as input:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker
import numpy as np

role = sagemaker.get_execution_role()
# Hub Model configuration. https://huggingface.co/models
hub = {
    # 'HF_MODEL_ID':'openai/whisper-base',
    'HF_MODEL_ID': 'facebook/wav2vec2-base-960h',
    'HF_TASK':'automatic-speech-recognition'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role, 
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1, # number of instances
    instance_type='ml.m5.xlarge' # ec2 instance type
)

input_array = np.random.randn(1, 10000)
predictor.predict({
    'inputs': input_array
})

Wow!
That's a great example. Thanks.

I hope that code can be made available from Hugging Face model pages such as https://huggingface.co/facebook/wav2vec2-large-960h

Do you know where to commit in order to change the auto-generated code for the automatic speech recognition example on Hugging Face model pages?

Hi @wildgeece96

Have you managed to call a SageMaker endpoint with JSON data, using a dict with a numpy.ndarray as inputs, like in our example: {"sampling_rate": int, "raw": np.array}?

Actually, I'm facing the same issue. I'm using the AWS prebuilt deep learning containers for Hugging Face models, and if I send JSON data to the endpoint, the container uses a simple json.load decoder. The model then fails because it gets a list instead of an np.array for the data.
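The type loss described here can be reproduced locally without any endpoint. The sketch below round-trips a payload of the {"sampling_rate": int, "raw": np.array} shape through JSON and then rebuilds the array. The helper name restore_audio_inputs, the 16000 Hz sampling rate, and the float32 dtype are illustrative assumptions and are not part of the toolkit.

```python
import json

import numpy as np

# Client side: an ndarray has to become a list before it fits into JSON.
raw_audio = np.zeros(4, dtype=np.float32)  # placeholder samples
payload = {"inputs": {"sampling_rate": 16000, "raw": raw_audio.tolist()}}
body = json.dumps(payload)

# Server side: a plain JSON decoder hands back Python lists, not arrays.
decoded = json.loads(body)
print(type(decoded["inputs"]["raw"]))  # <class 'list'>


def restore_audio_inputs(inputs):
    """Hypothetical helper: rebuild the dict-with-ndarray form from decoded JSON."""
    return {
        "sampling_rate": int(inputs["sampling_rate"]),
        "raw": np.asarray(inputs["raw"], dtype=np.float32),  # dtype is an assumption
    }


restored = restore_audio_inputs(decoded["inputs"])
```

Where this conversion runs depends on the container. With a prebuilt image it would have to go into user-supplied inference code, since the default decoding step stops at the list.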