The Python-Based Custom Runtime with MLServer cannot deploy a model stored on a Persistent Volume Claim
zhlsunshine opened this issue · 0 comments
Describe the bug
I store the trained model (mnist-svm.joblib in my case) on a PVC, and I have some extra logic to apply to the trained model after it is loaded. Therefore, I need to write a custom ServingRuntime to handle it; a minimal sketch of what that runtime looks like follows.
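For context, my custom runtime has roughly the shape below. This is only a minimal sketch, not my exact code: the class name CustomSVMRuntime and the post-load step are placeholders, while MLModel, get_model_uri, and the codec helpers are the pieces described in MLServer's custom-runtime docs.

```python
import joblib
from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse
from mlserver.utils import get_model_uri


class CustomSVMRuntime(MLModel):
    """Hypothetical custom runtime with extra post-load handling."""

    async def load(self) -> bool:
        # Resolve the model URI from the model settings; with a PVC this
        # should point at the mounted copy of mnist-svm.joblib.
        model_uri = await get_model_uri(self._settings)
        self._model = joblib.load(model_uri)

        # ... extra logic on the loaded model would go here ...

        self.ready = True
        return self.ready

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Decode the request into a numpy array, run the SVM, re-encode.
        input_data = self.decode_request(payload, default_codec=NumpyCodec)
        prediction = self._model.predict(input_data)
        return InferenceResponse(
            model_name=self.name,
            outputs=[NumpyCodec.encode_output(name="predict", payload=prediction)],
        )
```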
To Reproduce
It works well when I follow the doc Deploy a model stored on a Persistent Volume Claim: I can see the model file mnist-svm.joblib and model-settings.json under the folder /models/_mlserver_models/, as shown in the screenshot below:
However, since I want a custom ServingRuntime, I go on to follow the doc Python-Based Custom Runtime with MLServer, create a new ServingRuntime, and create an InferenceService for that runtime. After all these steps, everything looks okay except that the InferenceService never becomes Ready=True: it fails with a "NOT_FOUND" error, as shown below:
Based on this comparison, I think there is something wrong with the Python-Based Custom Runtime with MLServer when a Persistent Volume Claim is used to store the trained model.
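In case it helps narrow this down, a stub runtime along these lines could be dropped into the custom ServingRuntime image to check whether the PVC-mounted files are visible at all from inside the pod. The class name and the mount-path check are my own assumptions; /models/_mlserver_models/ is simply the path the stock runtime used in the working case above, and it may well differ (or not exist) in the custom-runtime pod.

```python
import logging
import os

from mlserver import MLModel
from mlserver.utils import get_model_uri

logger = logging.getLogger(__name__)


class DebugRuntime(MLModel):
    """Hypothetical stub runtime that only reports what it can see on disk."""

    async def load(self) -> bool:
        model_uri = await get_model_uri(self._settings)
        logger.info("Resolved model URI: %s", model_uri)

        # Path where the stock MLServer runtime saw the PVC model (see above);
        # whether it also exists in the custom-runtime pod is the open question.
        mount_dir = "/models/_mlserver_models/"
        if os.path.isdir(mount_dir):
            logger.info("%s contains: %s", mount_dir, os.listdir(mount_dir))
        else:
            logger.warning("%s does not exist in this pod", mount_dir)

        self.ready = True
        return self.ready
```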
Expected behavior
I hope there can be an explicit demo showing how a Python-based custom runtime with MLServer can serve a model stored on a Persistent Volume Claim. Thanks a lot if that's possible!