[EPIC] mlflow integration


shreyashankar opened this issue 3 years ago · 0 comments

When browsing historical component runs in the UI, there is currently no way to see the output metric or param values of those runs. To address this, we will do the following:

`mlflow` integration User Story

Querying / Debugging

Questions:

Which data files correspond to the biggest performance delta?
How can we determine performance in the UI?

User runs history COMPONENT_NAME
In each row of the resulting table, we also want to see some performance metric
Find the row with the worst performance metric and click on that row's output filename to trace it
In the ComponentRun info card, we can also see the performance metrics and hyperparameters

Logging / Building Pipeline

Assumptions:

User uses mlflow functions like log_param and log_metric
User has created / initialized some experiment & run combination
mlflow entities: experiment, run, params/metrics
mltrace entities: component, componentrun, iopointer

We will link an mlflow run with an mltrace ComponentRun.

Regular mlflow

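import mlflow
import mltrace

@mltrace.components.Training().run
def training(...):
  # init mlflow run; start_run() yields an ActiveRun object, not a bare run ID
  with mlflow.start_run() as run:
    # do some work
    mlflow.log_param("hparam", hparam)  # log_param takes a key and a value
    # train model
    mlflow.log_metric("accuracy", acc)
    save(model)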

Autolog mlflow

import mlflow.keras
import mltrace

mlflow.keras.autolog()

@mltrace.components.Component(name="training").run
def training(...):
  # train in keras

To Dos to enable integration

Storage / DB Layer

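Add a column that stores the mlflow run ID in the ComponentRun table (mlflow run IDs are strings, so this should be a string column rather than an integer one)
- Create a DB migration script that adds the column
- Add a new field and setter method to the ComponentRun class in db/models.py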
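
A minimal sketch of the migration step, under loud assumptions: it uses the standard-library `sqlite3` module so it runs on its own, whereas the real migration would target mltrace's actual database and migration tooling. The table name `component_runs` and the helper `add_mlflow_run_id_column` are hypothetical. Because mlflow run IDs are strings, the sketch adds a nullable `TEXT` column, so existing rows simply have no linked run.

```python
import sqlite3


def add_mlflow_run_id_column(conn: sqlite3.Connection,
                             table: str = "component_runs") -> bool:
    """Add a nullable mlflow_run_id column; return False if it already exists.

    `table` is a hypothetical name, not necessarily mltrace's real table.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    if "mlflow_run_id" in columns:
        return False  # already migrated, so the script is safe to re-run
    conn.execute(f"ALTER TABLE {table} ADD COLUMN mlflow_run_id TEXT")
    conn.commit()
    return True


# Demo against an in-memory database with one pre-existing row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE component_runs (id INTEGER PRIMARY KEY, component_name TEXT)"
)
conn.execute("INSERT INTO component_runs (component_name) VALUES ('training')")

added = add_mlflow_run_id_column(conn)
added_again = add_mlflow_run_id_column(conn)
rows = conn.execute(
    "SELECT component_name, mlflow_run_id FROM component_runs"
).fetchall()
```

Keeping the column nullable means component runs logged before the integration, or without mlflow, remain valid.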

API Layer (e.g., what we do in the decorator, functions that the UI will call)

in client.py, create a wrapper around mlflow.get_run so the UI can get the relevant run data to display (flesh out more)
modify the component_run method in the __init__.py file to retrieve mlflow metrics & params using the mlflow run ID
in the decorator (base_component.py run method), get the current mlflow run ID using mlflow.active_run().info.run_id and log this run ID along with the rest of the ComponentRun
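
The decorator step could look roughly like the sketch below. Everything except `mlflow.active_run()` and `.info.run_id` is an assumption made for illustration: `ComponentRunRecord`, `LOGGED_RUNS`, `component`, and `current_mlflow_run_id` are hypothetical stand-ins, not mltrace's actual classes or API. The run ID lookup is injectable so the sketch runs without mlflow installed. Note that `mlflow.active_run()` returns `None` when no run is active, so the lookup has to handle that case; it also only sees a run that is already active when the wrapper is entered, not one opened later inside the function body.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable, List, Optional


@dataclass
class ComponentRunRecord:
    """Hypothetical stand-in for mltrace's ComponentRun."""
    component_name: str
    mlflow_run_id: Optional[str] = None


def current_mlflow_run_id() -> Optional[str]:
    """Return the active mlflow run ID, or None if unavailable."""
    try:
        import mlflow  # lazy import: the sketch still loads without mlflow
    except ImportError:
        return None
    active = mlflow.active_run()  # None when no run is active
    return active.info.run_id if active is not None else None


# Hypothetical in-memory stand-in for wherever ComponentRuns get logged.
LOGGED_RUNS: List[ComponentRunRecord] = []


def component(name: str,
              get_run_id: Callable[[], Optional[str]] = current_mlflow_run_id):
    """Decorator that records the mlflow run ID alongside the component run."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Capture the ID on entry, while the caller's run is still active.
            record = ComponentRunRecord(name, mlflow_run_id=get_run_id())
            result = func(*args, **kwargs)
            LOGGED_RUNS.append(record)  # log only after the function succeeds
            return result
        return wrapper
    return decorator


# Demo with a fake run ID lookup in place of mlflow.
@component("training", get_run_id=lambda: "abc123")
def training(x: int) -> int:
    return x * 2


output = training(21)
```

If runs are started inside the decorated function, as in the "Regular mlflow" example, the ID would need to be captured some other way, for instance by having the function report it.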

Query Layer (what do we show in the UI)

in history view: add a new column to the table that shows mlflow params/metrics
in ComponentRun info card: add a link to the mlflow run in the mlflow UI and display some params/metrics
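
To make the data behind these views concrete, here is a hedged sketch of a wrapper around `mlflow.get_run` that flattens a run into the params and metrics the UI would display. The function name `summarize_mlflow_run` and the returned dictionary shape are assumptions, not an existing mltrace API. The `get_run` argument is injectable so the demo can use a fake run object instead of a live mlflow tracking server.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, Optional


def summarize_mlflow_run(
    run_id: Optional[str],
    get_run: Optional[Callable[[str], Any]] = None,
) -> Dict[str, Any]:
    """Return the params and metrics of an mlflow run for display.

    Hypothetical helper; the output shape is an assumption for illustration.
    """
    if run_id is None:
        # Component run was logged without a linked mlflow run.
        return {"mlflow_run_id": None, "params": {}, "metrics": {}}
    if get_run is None:
        import mlflow  # lazy import so the sketch loads without mlflow
        get_run = mlflow.get_run
    run = get_run(run_id)
    return {
        "mlflow_run_id": run_id,
        "params": dict(run.data.params),    # mlflow stores param values as strings
        "metrics": dict(run.data.metrics),  # latest value per metric key
    }


# Demo with a fake object that mimics the run.data.params / run.data.metrics shape.
fake_run = SimpleNamespace(
    data=SimpleNamespace(params={"lr": "0.01"}, metrics={"accuracy": 0.93})
)
summary = summarize_mlflow_run("abc123", get_run=lambda _run_id: fake_run)
unlinked = summarize_mlflow_run(None)
```

Returning an empty summary for unlinked runs lets the history table and info card render every row the same way, whether or not mlflow was used.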
