aws/sagemaker-inference-toolkit

Deploying multiple model artifacts, each having their own inference handler

Opened this issue · 5 comments

What did you find confusing? Please describe.
I am trying to deploy multiple tarball model artifacts to a SageMaker multi-model endpoint, but would like to use a different inference handler for each model, since each model needs different pre-processing and post-processing.
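For context, per-model handlers would require each tarball to carry its own inference script alongside the model file. A minimal sketch of packaging two such artifacts (file names, directory layout, and handler bodies are hypothetical placeholders, not a toolkit convention):

```python
import tarfile
import tempfile
from pathlib import Path

def package_model(model_dir: Path, out_dir: Path) -> Path:
    """Tar up one model directory (model file + code/inference.py)."""
    archive = out_dir / f"{model_dir.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for f in sorted(model_dir.rglob("*")):
            if f.is_file():
                tar.add(f, arcname=f.relative_to(model_dir))
    return archive

# Hypothetical layout: two models, each bundling its own inference handler.
root = Path(tempfile.mkdtemp()) / "models"
for name in ("model_a", "model_b"):
    d = root / name
    (d / "code").mkdir(parents=True, exist_ok=True)
    (d / "model.pth").write_text("weights placeholder")
    (d / "code" / "inference.py").write_text("def input_fn(data, ct): ...\n")
    package_model(d, root)
```

Each resulting tarball is self-contained, which is what per-model handler support would need to pick up.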

Describe how documentation can be improved
The documentation is fairly clear on how to specify a custom inference handler, but it is not clear whether a different custom handler can be specified for each model.

Additional context
I discovered that a custom handler can be provided to the MMS model archiver here, but it's not clear whether this allows a different handler for each model.

I love the inference toolkit, and would sincerely appreciate a response regarding whether it is possible to define differing inference handlers per model, and how to do so.

Thanks for the kind words! Unfortunately, this isn't currently supported, but I'll leave this issue open as a feature request.
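Until this is supported, one workaround is to ship a single shared inference script whose handler dispatches pre- and post-processing based on which model is being invoked. A sketch only (the model names, transform functions, and `dispatch_input` helper are all hypothetical, not toolkit API; the real toolkit handler does not receive the model name as an argument, so you would need to recover it yourself, e.g. from the serving context):

```python
import json

def _preprocess_a(payload):
    # Hypothetical model_a-specific preprocessing: normalize pixel values.
    return [x / 255.0 for x in payload["pixels"]]

def _preprocess_b(payload):
    # Hypothetical model_b-specific preprocessing: pass tokens through.
    return payload["tokens"]

# Registry mapping each deployed model to its preprocessing function.
PREPROCESSORS = {"model_a": _preprocess_a, "model_b": _preprocess_b}

def dispatch_input(model_name, request_body):
    """Route a JSON request to the preprocessor registered for this model."""
    payload = json.loads(request_body)
    try:
        return PREPROCESSORS[model_name](payload)
    except KeyError:
        raise ValueError(f"no preprocessor registered for {model_name!r}")
```

The same registry pattern works for post-processing; the trade-off is that all models' handler code lives in one script rather than in each model's artifact.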

It took a long time, reading the code, to figure out that this isn't supported. I was experimenting with single models first and then wanted to move to multi-model; it's unfortunate that this isn't supported.

+1 for this

+1, it's very confusing how this is all supposed to fit together. One would assume that SageMaker would support the same functionality as the Multi Model Server in the Inference Toolkit. It seems the only option is to use separate endpoints after all.

Any update on this? I'm trying to achieve something similar. It would be great to have this toolkit support multiple models, each with its own inference code.