zenml-io/zenml-dashboard

I'm unable to deploy an ML model locally to MLflow using ZenML

PriyanshBhardwaj opened this issue · 5 comments

Checks

  • [*] I added a descriptive title to this issue
  • [*] I have searched (google, github) for similar issues and couldn't find anything
  • [*] I have read and followed the docs and still think this is a bug

General

I followed all the steps, registering the experiment tracker and model deployer and creating the stack:

zenml experiment-tracker register mlflow_tracker --flavor=mlflow

zenml model-deployer register mlflow --flavor=mlflow

zenml stack register mlflow_stack -a default -o default -d mlflow -e mlflow_tracker --set

I created a pipeline that ingests data, trains the model, evaluates its performance, and then deploys the model if it passes the trigger:

df = ingest_data(data_path = data_path)
x_train, x_test, y_train, y_test = clean_data(df)
model = train_model(x_train, x_test, y_train, y_test)
r2, mse, rmse = evaluate_model(model, x_test, y_test)
deployment_decision = deployment_trigger(accuracy=r2)  # deploy based on the R2 score
mlflow_model_deployer_step(
    model=model,
    deploy_decision=deployment_decision,
    workers=workers,
    timeout=timeout,
)
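For context, a trigger of this kind could look roughly like the sketch below. It is plain Python rather than a ZenML step, and the `min_accuracy` threshold and its default value are hypothetical; the actual `deployment_trigger` used in this pipeline is not shown in the issue.

```python
def deployment_trigger(accuracy: float, min_accuracy: float = 0.5) -> bool:
    """Return True when the score is high enough to deploy.

    `min_accuracy` is a hypothetical threshold, not taken from the issue.
    """
    return accuracy >= min_accuracy


# Example: a model with an R2 score of 0.72 passes the assumed threshold.
deployment_decision = deployment_trigger(accuracy=0.72)
```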

I debugged it: every step works and the model passes the deployment trigger, but the model deployer does not work properly. The problem is with this step:

mlflow_model_deployer_step(
    model=model,
    deploy_decision=deployment_decision,
    workers=workers,
    timeout=timeout,
)

When the pipeline calls it, the following is logged:

Updating an existing MLflow deployment service: MLFlowDeploymentService[2ade1153-7fd3-45d1-8ecd-412f86b264b5] (type: model-serving, flavor: mlflow)

This is logged from the function deploy_model() in the file /zenml/integrations/mlflow/model_deployers/mlflow_model_deployer.py. At line 210, the same function calls service.start(), which is defined in the file /zenml/services/local/local_service.py. At line 387, start() logs Starting service 'MLFlowDeploymentService[2ade1153-7fd3-45d1-8ecd-412f86b264b5] (type: model-serving, flavor: mlflow)'. Then, when it calls if not self.poll_service_status(timeout): at line 391, it logs this error:

Timed out waiting for service MLFlowDeploymentService[2ade1153-7fd3-45d1-8ecd-412f86b264b5] (type: model-serving, flavor: mlflow) to become active:
  Administrative state: active
  Operational state: inactive
  Last status message: 'service daemon is not running'
For more information on the service status, please see the following log file: 

When I opened the log file, it said:

TypeError: Cannot load service with unregistered service type:
type='model-serving' flavor='mlflow' name='mlflow-deployment'
description='MLflow prediction service'

The error is raised from /zenml/services/service_registry.py:193 in load_service_from_dict.
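One way to picture how an error like this can arise is a registry that maps service types to classes and only knows the classes registered in the current process. The sketch below is a simplified stand-in, not ZenML's actual code: `ToyRegistry`, `register`, `load_from_dict`, and `ToyDeploymentService` are hypothetical names, and whether a missing registration in the daemon process is the real cause here is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Type


@dataclass(frozen=True)
class ServiceType:
    """Simplified stand-in for a service type descriptor (frozen, so hashable)."""

    name: str
    type: str
    flavor: str
    description: str = ""


class ToyRegistry:
    """Hypothetical registry mapping a ServiceType to its implementing class."""

    def __init__(self) -> None:
        self._classes: Dict[ServiceType, Type[Any]] = {}

    def register(self, service_type: ServiceType, cls: Type[Any]) -> None:
        self._classes[service_type] = cls

    def load_from_dict(self, config: Dict[str, Any]) -> Any:
        service_type = ServiceType(**config["service_type"])
        cls = self._classes.get(service_type)
        if cls is None:
            # Mirrors the wording of the error in the log file.
            raise TypeError(
                f"Cannot load service with unregistered service type: {service_type}"
            )
        return cls()


class ToyDeploymentService:
    """Placeholder for a deployment service class."""


MLFLOW_TYPE = ServiceType(
    name="mlflow-deployment",
    type="model-serving",
    flavor="mlflow",
    description="MLflow prediction service",
)
config = {"service_type": asdict(MLFLOW_TYPE)}

# Process that registered the class: loading succeeds.
pipeline_registry = ToyRegistry()
pipeline_registry.register(MLFLOW_TYPE, ToyDeploymentService)
service = pipeline_registry.load_from_dict(config)

# Fresh registry (e.g. a separate process that never registered the class):
# the same config fails even though the type values are identical.
daemon_registry = ToyRegistry()
try:
    daemon_registry.load_from_dict(config)
    daemon_error = None
except TypeError as exc:
    daemon_error = str(exc)
```

In this toy model the type values themselves are fine; the lookup fails only because nothing was registered in the registry doing the loading.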

However, in the file mlflow_model_deployer.py at line 187, service gets its value from service = cast(MLFlowDeploymentService, existing_service), which is MLFlowDeploymentService[2ade1153-7fd3-45d1-8ecd-412f86b264b5] (type: model-serving, flavor: mlflow).

And in the class MLFlowDeploymentService in the file mlflow_deployment.py, at line 128, the type is already defined as "model-serving", which I think can't be changed:

SERVICE_TYPE = ServiceType(
    name="mlflow-deployment",
    type="model-serving",
    flavor="mlflow",
    description="MLflow prediction service",
)

The timeout happens because MLFlowDeploymentService[2ade1153-7fd3-45d1-8ecd-412f86b264b5] (type: model-serving, flavor: mlflow) never starts, so it will always time out no matter what the timeout value is.
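The reasoning above can be illustrated with a generic polling loop. This is a minimal sketch, not the implementation of poll_service_status: `poll_until_active`, `is_active`, and `interval` are hypothetical names, and the timeouts are shortened so the example runs quickly.

```python
import time
from typing import Callable


def poll_until_active(
    is_active: Callable[[], bool], timeout: float, interval: float = 0.01
) -> bool:
    """Poll `is_active` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_active():
            return True
        time.sleep(interval)
    # One last check at the deadline.
    return is_active()


def daemon_that_never_starts() -> bool:
    """Stand-in for a service whose daemon crashed on startup."""
    return False


# A longer timeout does not help if the service can never become active.
short_wait = poll_until_active(daemon_that_never_starts, timeout=0.05)
long_wait = poll_until_active(daemon_that_never_starts, timeout=0.1)
```

Both calls return False: waiting longer only delays the failure when the underlying service cannot start.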

So why does the log file show this error?

TypeError: Cannot load service with unregistered service type:

I tried everything I could think of to solve it:

  • created a new stack from scratch
  • created a new pipeline in different stacks
  • re-initialized ZenML by deleting the .zen folder and running zenml init again in the terminal in the same directory

I also tried passing --type=mlflow when registering the model deployer, as mentioned somewhere in your old docs, with this command:

zenml model-deployer register mlflow --type=mlflow --flavor=mlflow

Unsurprisingly, it didn't work.

Nothing worked, and your docs don't have a solution for this kind of problem.

My issue is that I tried everything to debug this but still cannot deploy my model locally to MLflow, because I can't change the type in your internal class, which I believe is the root cause. Please resolve the issue, and please update your logs to make them clearer for users.

P.S.: The model is only 1000 bytes; timeouts of 60 and 120 both failed.
I'm using a separate environment for this.
I'm on the latest versions of zenml and mlflow.

@PriyanshBhardwaj Thank you for this very detailed report. It seems you found a bug. We'll investigate and get back to you ASAP.

@PriyanshBhardwaj Ah I just realized you have reported it on the wrong repository. Please create a new issue in the ZenML repository, not the dashboard one.

@strickvl Tagging you for visibility; this seems like a good investigation of the MLflow issue we've been having with other users.

OK, I've added the issue to the zenml repo: zenml-io/zenml#2235

Closing this issue in favor of the zenml repository one. Thanks for the report @PriyanshBhardwaj and we'll investigate it.