Online endpoint deployment failing for custom models
obiii opened this issue · 3 comments
Operating System
Linux
Version Information
Python Version: 3.10
SDK: V2
azure-ai-ml package version: 1.8.0
Steps to reproduce
Hi,
I am following the notebook to deploy a model to an online endpoint. While deploying using:
```python
from azure.ai.ml.entities import (
    CodeConfiguration,
    Environment,
    ManagedOnlineDeployment,
    Model,
)

# Local model file plus a custom environment built from an image in ACR
model = Model(path="../model-1/model/sklearn_regression_model.pkl")
env = Environment(
    image="acrestmlopsdev.azurecr.io/reg_env_dd2",
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    model=model,
    environment=env,
    instance_type="Standard_F4s_v2",
    code_configuration=CodeConfiguration(
        code="../model-1/onlinescoring", scoring_script="score.py"
    ),
    instance_count=1,
    egress_public_network_access="disabled",
)

ml_client.online_deployments.begin_create_or_update(blue_deployment).result()
```
the deployment fails with: `A required package azureml-inference-server-http is missing.`
The environment we are using is registered in the AzureML workspace. The Dockerfile and conda dependency file used to build the Docker image in ACR are as follows:
Dockerfile:
```dockerfile
# AzureML base image with miniconda under /opt/miniconda
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04

# Copy the conda dependency file into the image as conda_env.yml
COPY deps.yml conda_env.yml

# Use bash as /bin/sh so that `source` works in RUN steps
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo "source /opt/miniconda/etc/profile.d/conda.sh && conda activate" >> ~/.bashrc
RUN cat conda_env.yml

# Activate the base conda environment and update it with the dependencies
# from deps.yml (including azureml-inference-server-http)
RUN source /opt/miniconda/etc/profile.d/conda.sh && \
    conda activate && \
    conda install conda && \
    pip install cmake && \
    conda env update -f conda_env.yml
```
deps.yml:
```yaml
name: model-env
channels:
  - conda-forge
dependencies:
  - python=3.7
  - numpy=1.21.2
  - pip=21.2.4
  - scikit-learn=0.24.2
  - scipy=1.7.1
  - pip:
      - inference-schema[numpy-support]==1.5
      - joblib==1.0.1
      - azureml-inference-server-http
```
deps.yml lists azureml-inference-server-http as a dependency, the Docker image builds fine, and the AzureML environment created from that image registers without issues.
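As a sanity check (a hedged example: the image tag is ours, and this assumes pip is on the container's default PATH and no conflicting entrypoint is set), the package can be inspected inside the image, which may help reveal whether the runtime interpreter differs from the one the packages were installed into:

```bash
# Show the package as seen by the image's default interpreter
docker run --rm acrestmlopsdev.azurecr.io/reg_env_dd2 \
  pip show azureml-inference-server-http
```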
Expected behavior
The online endpoint deploys successfully.
Actual behavior
The deployment fails with the following error:
```
2023-10-25T15:41:55,383013296+00:00 | gunicorn/run |
2023-10-25T15:41:55,384232095+00:00 | gunicorn/run | Entry script directory: /var/azureml-app/onlinescoring/.
2023-10-25T15:41:55,385439495+00:00 | gunicorn/run |
2023-10-25T15:41:55,386724694+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,387960893+00:00 | gunicorn/run | Dynamic Python Package Installation
2023-10-25T15:41:55,389318393+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,390611292+00:00 | gunicorn/run |
2023-10-25T15:41:55,392044492+00:00 | gunicorn/run | Dynamic Python package installation is disabled.
2023-10-25T15:41:55,393430091+00:00 | gunicorn/run |
2023-10-25T15:41:55,394692890+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,395941190+00:00 | gunicorn/run | Checking if the Python package azureml-inference-server-http is installed
2023-10-25T15:41:55,397200089+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,398420089+00:00 | gunicorn/run |
2023-10-25T15:41:55,663463169+00:00 | gunicorn/run | A required package azureml-inference-server-http is missing. Please install azureml-inference-server-http before trying again
2023-10-25T15:41:55,666521767+00:00 - gunicorn/finish 100 0
2023-10-25T15:41:55,667702367+00:00 - Exit code 100 is not normal. Killing image
```
Additional information
No response
An update:
We have successfully resolved the deployment issue; however, the resolution process raised an interesting observation. Despite including all packages in the deps.yml file, we encountered deployment failures indicating that the package needed to be installed, even though the Docker image built successfully and the environment was created from that image.
When creating the Azure ML Environment, we specified the conda_file parameter and provided the path to the deps.yml file. For instance:
```python
env = Environment(
    conda_file="deps.yml",
    image="acrestmlopsdev.azurecr.io/reg_env_dd2",
)
```
This approach indeed resolved the problem, although it remains somewhat unclear why the dependency had to be declared again at environment creation when it was already baked into the image.
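Our best guess (unverified): when only image is set, the scoring server appears to run with the image's default Python interpreter, whereas conda_file makes AzureML build and activate a dedicated conda environment on top of the image, where the package is then visible. If that is right, a Dockerfile-only alternative might be to put the updated environment first on PATH; a sketch, assuming the packages landed in the base env under /opt/miniconda as in our Dockerfile above:

```dockerfile
# Unverified sketch: make the updated conda env the default interpreter
# so the scoring server can import azureml-inference-server-http
ENV PATH=/opt/miniconda/bin:$PATH
```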
We would be thankful if anyone could shed some light here.
Thanks,
OR
I have the exact same issue when building from a custom docker image... not really sure how it's different from the conda_path approach as well...
Maybe this blog post helps; it is similar to what you are trying to do.
I've run this yesterday and it worked fine. If you bring your own container, afaik you'll need your own inference server, e.g. vllm for serving. I'd suggest to either:

- use base images (see the blog post) and then bake everything in via conda.yml; not sure if you can mess with the Dockerfile in this case
- use a custom container, but then don't use conda in AzureML (see this post); purely do everything in your Dockerfile, as in the sketch below
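For reference, a minimal sketch of the second option (the base image and the package pins are illustrative, not a verified setup): install the inference server and the model dependencies directly into the image's default Python, and set only image on the AzureML Environment.

```dockerfile
# Illustrative only: everything baked into the image, no conda_file in AzureML
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
RUN pip install azureml-inference-server-http \
    "inference-schema[numpy-support]==1.5" joblib==1.0.1 scikit-learn==0.24.2
```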
Hope this helps.