Online endpoint deployment failing for custom models
obiii opened this issue · 3 comments
Operating System
Linux
Version Information
Python Version: 3.10
SDK: V2
azure-ai-ml package version: 1.8.0
Steps to reproduce
Hi,
I am following the notebook to deploy a model to an online endpoint. While deploying using:
```python
from azure.ai.ml.entities import (
    CodeConfiguration,
    Environment,
    ManagedOnlineDeployment,
    Model,
)

# Local model file plus a custom environment built from an image in ACR
model = Model(path="../model-1/model/sklearn_regression_model.pkl")
env = Environment(
    image="acrestmlopsdev.azurecr.io/reg_env_dd2",
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    model=model,
    environment=env,
    instance_type="Standard_F4s_v2",
    code_configuration=CodeConfiguration(
        code="../model-1/onlinescoring", scoring_script="score.py"
    ),
    instance_count=1,
    egress_public_network_access="disabled",
)

ml_client.online_deployments.begin_create_or_update(blue_deployment).result()
```
the deployment fails with: `A required package azureml-inference-server-http is missing.`
The environment we are using is registered in the AzureML workspace. The Dockerfile and conda dependency file used to build the Docker image in ACR are as follows:
Dockerfile:
```dockerfile
# AzureML base image with miniconda under /opt/miniconda
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04

# Copy the conda dependency file into the image as conda_env.yml
COPY deps.yml conda_env.yml

# Use bash as /bin/sh so that `source` works in RUN steps
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo "source /opt/miniconda/etc/profile.d/conda.sh && conda activate" >> ~/.bashrc
RUN cat conda_env.yml

# Activate the base conda environment and update it with the dependencies
# from deps.yml (including azureml-inference-server-http)
RUN source /opt/miniconda/etc/profile.d/conda.sh && \
    conda activate && \
    conda install conda && \
    pip install cmake && \
    conda env update -f conda_env.yml
```
deps.yml:
```yaml
name: model-env
channels:
  - conda-forge
dependencies:
  - python=3.7
  - numpy=1.21.2
  - pip=21.2.4
  - scikit-learn=0.24.2
  - scipy=1.7.1
  - pip:
      - inference-schema[numpy-support]==1.5
      - joblib==1.0.1
      - azureml-inference-server-http
```
deps.yml lists azureml-inference-server-http as a dependency, the Docker image builds fine, and the AzureML environment created from that image registers without issues.
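As a sanity check (a hedged example: the image tag is ours, and this assumes pip is on the container's default PATH and no conflicting entrypoint is set), the package can be inspected inside the image, which may help reveal whether the runtime interpreter differs from the one the packages were installed into:

```bash
# Show the package as seen by the image's default interpreter
docker run --rm acrestmlopsdev.azurecr.io/reg_env_dd2 \
  pip show azureml-inference-server-http
```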
Expected behavior
The online endpoint deploys successfully.
Actual behavior
The deployment fails with the following error:
```
2023-10-25T15:41:55,383013296+00:00 | gunicorn/run |
2023-10-25T15:41:55,384232095+00:00 | gunicorn/run | Entry script directory: /var/azureml-app/onlinescoring/.
2023-10-25T15:41:55,385439495+00:00 | gunicorn/run |
2023-10-25T15:41:55,386724694+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,387960893+00:00 | gunicorn/run | Dynamic Python Package Installation
2023-10-25T15:41:55,389318393+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,390611292+00:00 | gunicorn/run |
2023-10-25T15:41:55,392044492+00:00 | gunicorn/run | Dynamic Python package installation is disabled.
2023-10-25T15:41:55,393430091+00:00 | gunicorn/run |
2023-10-25T15:41:55,394692890+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,395941190+00:00 | gunicorn/run | Checking if the Python package azureml-inference-server-http is installed
2023-10-25T15:41:55,397200089+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,398420089+00:00 | gunicorn/run |
2023-10-25T15:41:55,663463169+00:00 | gunicorn/run | A required package azureml-inference-server-http is missing. Please install azureml-inference-server-http before trying again
2023-10-25T15:41:55,666521767+00:00 - gunicorn/finish 100 0
2023-10-25T15:41:55,667702367+00:00 - Exit code 100 is not normal. Killing image
```
Additional information
No response
An update:
We have successfully resolved the deployment issue; however, the resolution process raised an interesting observation. Despite including all packages in the deps.yml file, we encountered deployment failures indicating that the package needed to be installed, even though the Docker image built successfully and the environment was created from that image.
When creating the Azure ML Environment, we specified the conda_file parameter and provided the path to the deps.yml file. For instance:
```python
env = Environment(
    conda_file="deps.yml",
    image="acrestmlopsdev.azurecr.io/reg_env_dd2",
)
```
This approach indeed resolved the problem, although it remains somewhat unclear why the dependency had to be declared again at environment creation when it was already baked into the image.
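Our best guess (unverified): when only image is set, the scoring server appears to run with the image's default Python interpreter, whereas conda_file makes AzureML build and activate a dedicated conda environment on top of the image, where the package is then visible. If that is right, a Dockerfile-only alternative might be to put the updated environment first on PATH; a sketch, assuming the packages landed in the base env under /opt/miniconda as in our Dockerfile above:

```dockerfile
# Unverified sketch: make the updated conda env the default interpreter
# so the scoring server can import azureml-inference-server-http
ENV PATH=/opt/miniconda/bin:$PATH
```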
We would be thankful if anyone could shed some light here.
Thanks,
OR
I have the exact same issue when building from a custom docker image... not really sure how it's different from the conda_path approach as well...
Maybe this blog post helps; it is similar to what you are trying to do.
I've run this yesterday and it worked fine. If you bring your own container, afaik you'll need your own inference server, e.g. vllm for serving. I'd suggest to either:

- use base images (see the blog post) and then bake everything in via conda.yml; not sure if you can mess with the Dockerfile in this case
- use a custom container, but then don't use conda in AzureML (see this post); purely do everything in your Dockerfile, as in the sketch below
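For reference, a minimal sketch of the second option (the base image and the package pins are illustrative, not a verified setup): install the inference server and the model dependencies directly into the image's default Python, and set only image on the AzureML Environment.

```dockerfile
# Illustrative only: everything baked into the image, no conda_file in AzureML
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
RUN pip install azureml-inference-server-http \
    "inference-schema[numpy-support]==1.5" joblib==1.0.1 scikit-learn==0.24.2
```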
Hope this helps.