[EPIC] mlflow integration
shreyashankar opened this issue · 0 comments
A challenge in using the UI to go through historical component runs is that there is no way to see the output metric or param values of those runs. To address this, we will do the following:
mlflow integration User Story
Querying / Debugging
Questions:
- Which data files correspond to biggest performance delta?
- How can we determine performance in the UI?
- User runs `history COMPONENT_NAME`
- In each row of the resulting table, we also want to see some performance metric
- Look at the performance metric that's the worst, click on that output filename to trace
- In the ComponentRun info card, we can also see the performance metrics and hyperparameters
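The debugging flow above (list the history, find the worst metric, trace that output) could be sketched roughly as follows. The row shape used here (`output_filename` plus a `metrics` dict) is a hypothetical stand-in, not mltrace's actual history format, and `worst_run` is an illustrative helper name.

```python
from typing import Dict, List, Optional

# Hypothetical shape of one row in the history table: the output file a
# ComponentRun produced plus the metrics logged by its linked mlflow run.
HistoryRow = Dict[str, object]


def worst_run(rows: List[HistoryRow], metric: str,
              higher_is_better: bool = True) -> Optional[HistoryRow]:
    """Return the row with the worst value for `metric`, or None if no row has it."""
    scored = [r for r in rows if metric in r.get("metrics", {})]
    if not scored:
        return None
    # For metrics like accuracy the worst run is the minimum; for loss, the maximum.
    pick = min if higher_is_better else max
    return pick(scored, key=lambda r: r["metrics"][metric])


rows = [
    {"output_filename": "model_a.pkl", "metrics": {"accuracy": 0.91}},
    {"output_filename": "model_b.pkl", "metrics": {"accuracy": 0.78}},
    {"output_filename": "model_c.pkl", "metrics": {}},  # run with no linked metrics
]
worst = worst_run(rows, "accuracy")
print(worst["output_filename"])  # model_b.pkl
```

The returned output filename is what the user would click on to start a trace.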
Logging / Building Pipeline
Assumptions:
- User uses mlflow functions like `log_param` and `log_metric`
- User has created / initialized some experiment & run combination
- mlflow entities: experiment, run, params/metrics
- mltrace entities: component, componentrun, iopointer
We will link an mlflow run with an mltrace ComponentRun.
Regular mlflow
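import mlflow
import mltrace

@mltrace.components.Training().run
def training(...):
    # init mlflow run; start_run() yields an ActiveRun object, not a bare ID
    with mlflow.start_run() as run:
        # do some work
        mlflow.log_param("hparam", hparam)  # log_param takes a key and a value
        # train model
        mlflow.log_metric("accuracy", acc)  # key and value are separate arguments
        save(model)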
Autolog mlflow
import mlflow
import mltrace

mlflow.keras.autolog()

@mltrace.components.Component(name="training").run
def training(...):
    # train in keras
To Dos to enable integration
Storage / DB Layer
- Add a string column that stores the mlflow run ID in the `ComponentRun` table (mlflow run IDs are hex strings, not integers)
- Create a DB migration script that adds the column
- Add a new field and setter method to the `ComponentRun` class in `db/models.py`
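A minimal sketch of the storage changes, using the standard library `sqlite3` module as a stand-in for whatever database and migration tooling the project actually uses. The table name `component_runs`, the column name `mlflow_run_id`, and both helper functions are hypothetical. Because mlflow reports run IDs as strings, the sketch uses a nullable `TEXT` column.

```python
import sqlite3

# Hypothetical table and column names; the real schema lives in db/models.py.
TABLE = "component_runs"
COLUMN = "mlflow_run_id"


def add_mlflow_run_id_column(conn: sqlite3.Connection) -> bool:
    """Add the column if it is missing. Returns True if the schema changed."""
    existing = [row[1] for row in conn.execute(f"PRAGMA table_info({TABLE})")]
    if COLUMN in existing:
        return False
    # Nullable TEXT: rows recorded before the migration have no linked mlflow run.
    conn.execute(f"ALTER TABLE {TABLE} ADD COLUMN {COLUMN} TEXT")
    conn.commit()
    return True


def set_mlflow_run_id(conn: sqlite3.Connection, component_run_id: int, run_id: str) -> None:
    """Setter: link an existing ComponentRun row to an mlflow run."""
    conn.execute(f"UPDATE {TABLE} SET {COLUMN} = ? WHERE id = ?",
                 (run_id, component_run_id))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE {TABLE} (id INTEGER PRIMARY KEY, component_name TEXT)")
conn.execute(f"INSERT INTO {TABLE} (component_name) VALUES ('training')")
add_mlflow_run_id_column(conn)
set_mlflow_run_id(conn, 1, "0123456789abcdef0123456789abcdef")
```

Making the migration idempotent (checking for the column first) lets it be re-run safely against a database that has already been upgraded.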
API Layer (e.g., what we do in the decorator, functions that the UI will call)
- In `client.py`, create a wrapper around `mlflow.get_run` so the UI can get the relevant information to display (flesh out more)
- Modify the component_run method in the init_py file to retrieve mlflow metrics & params using the mlflow ID
- In the decorator (`base_component.py` run method), get the current mlflow run ID using `mlflow.active_run().info.run_id` and log this run ID along with the rest of the `ComponentRun`
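The decorator step could look something like the sketch below. `track`, `LOGGED_RUNS`, and `active_mlflow_run_id` are hypothetical names for illustration, not mltrace's API; the only mlflow call used is `mlflow.active_run()`, which returns `None` when no run is active, so the sketch guards against that instead of dereferencing `.info` directly. The run-ID getter is injectable so the example runs without mlflow installed.

```python
import functools
from typing import Any, Callable, Dict, List, Optional

# Stand-in for the ComponentRun store; a real implementation would write to the DB.
LOGGED_RUNS: List[Dict[str, Any]] = []


def active_mlflow_run_id() -> Optional[str]:
    """Return the active mlflow run ID, or None if mlflow is absent or no run is active."""
    try:
        import mlflow
    except ImportError:
        return None
    run = mlflow.active_run()  # None when no run is active
    return run.info.run_id if run is not None else None


def track(get_run_id: Callable[[], Optional[str]] = active_mlflow_run_id):
    """Hypothetical decorator: run the component, then log it with the mlflow run ID."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            LOGGED_RUNS.append({"component": func.__name__,
                                "mlflow_run_id": get_run_id()})
            return result
        return wrapper
    return decorator


@track(get_run_id=lambda: "fake-run-id")  # injected so the sketch runs without mlflow
def training():
    return "model"


training()
print(LOGGED_RUNS[-1])
```

One design point to settle: if the mlflow run is opened inside the function body with `with mlflow.start_run()`, it has already ended by the time the wrapper regains control, so `mlflow.active_run()` would return `None` there. The run would need to be started outside the component, or its ID captured while it is still active.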
Query Layer (what do we show in the UI)
- In the `history` view: add a new column to the table that represents mlflow params/metrics
- In the `ComponentRun` info card: add a link to the mlflow run's UI page and display some params/metrics
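One way the history column and info card might get their data is sketched below. `summarize_run` and `history_cell` are hypothetical helpers; `summarize_run` relies only on the `info.run_id`, `data.params`, and `data.metrics` attributes of the object that `mlflow.get_run` returns, and takes the fetch function as an argument so the example can run against a fake run object without mlflow installed.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict


def summarize_run(run_id: str, get_run: Callable[[str], Any]) -> Dict[str, Any]:
    """Fetch a run via `get_run` (e.g. mlflow.get_run) and keep what the UI shows."""
    run = get_run(run_id)
    return {"run_id": run.info.run_id,
            "params": dict(run.data.params),
            "metrics": dict(run.data.metrics)}


def history_cell(summary: Dict[str, Any], max_items: int = 3) -> str:
    """Render metrics, then params, as a compact string for one table cell."""
    items = [f"{k}={v:.3g}" for k, v in sorted(summary["metrics"].items())]
    items += [f"{k}={v}" for k, v in sorted(summary["params"].items())]
    shown = items[:max_items]
    if len(items) > max_items:
        shown.append("...")  # signal that the info card has more
    return ", ".join(shown)


# Fake run object mirroring the attributes used above.
fake_run = SimpleNamespace(
    info=SimpleNamespace(run_id="abc"),
    data=SimpleNamespace(params={"lr": "0.01"}, metrics={"accuracy": 0.9134}),
)
summary = summarize_run("abc", lambda _run_id: fake_run)
print(history_cell(summary))  # accuracy=0.913, lr=0.01
```

With mlflow available, the call would be `summarize_run(run_id, mlflow.get_run)`; the info card could show the full `summary` while the history table shows only the truncated cell.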