liupeirong/MLOpsManufacturing

AML Observability Library

liupeirong opened this issue · 8 comments

Requirements:

  1. The model scoring application should be able to log metrics and trace events to App Insights.
  2. It would also be nice to have the same code log metrics to both App Insights and Azure ML during training.
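Requirement 2 boils down to a fan-out pattern: one `log_metric()` call feeds every configured sink, so the same training code can report to both Azure ML and App Insights. A minimal sketch (class and sink names are illustrative, not the actual lib; in-memory sinks stand in for the real AML run and App Insights clients):

```python
class Observability:
    """Fan one metric call out to every configured sink."""

    def __init__(self, sinks):
        self.sinks = sinks  # e.g. an AML run sink and an App Insights sink

    def log_metric(self, name, value):
        for sink in self.sinks:
            sink.log(name, value)


class InMemorySink:
    """Stand-in for an AML Run or App Insights client."""

    def __init__(self):
        self.metrics = {}

    def log(self, name, value):
        self.metrics[name] = value


aml, app_insights = InMemorySink(), InMemorySink()
obs = Observability([aml, app_insights])
obs.log_metric("accuracy", 0.92)  # lands in both sinks
```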

Options for requirement 1:

Options for requirement 2:

Is there a single lib that can be used for both requirements?

Couldn't we use the Observability lib for both?

It looks like, in the scoring case, if you just don't define a run context, it will log to App Insights only:
https://github.com/microsoft/MLOpsPython/blob/c11342302123bf7a812e560726a3a509b937d13b/utils/logger/observability.py#L24-L39

And we could modify it to provide more flags and other loggers too.
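A simplified sketch of that dispatch (not the actual MLOpsPython code; names are illustrative): when no AML run context exists, as in scoring, only the App Insights logger is used, and a flag-driven version could extend the returned list with other loggers.

```python
def build_loggers(run_context, app_insights_logger, aml_logger_factory):
    """Pick the logging sinks based on whether an AML run context exists."""
    if run_context is None:  # scoring: no AML run was created
        return [app_insights_logger]
    # training: log to App Insights and to the AML run
    return [app_insights_logger, aml_logger_factory(run_context)]
```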

Yes, the Observability lib can log metrics and text, but it can't do tracing, which is useful in the inferencing application.

So how about trying to integrate the OpenCensus Python extension for Azure Monitor into the Observability lib? Wouldn't that be the tool we want? I haven't dug into whether there are technical blockers to doing so.
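For reference, the OpenCensus Azure Monitor extension (`opencensus-ext-azure`) exports spans via `AzureExporter`. A sketch of how the lib could wire it in, with the import guarded so the Observability lib stays importable without the extra dependency (the connection string is a placeholder):

```python
try:
    from opencensus.ext.azure.trace_exporter import AzureExporter
    from opencensus.trace.samplers import ProbabilitySampler
    from opencensus.trace.tracer import Tracer
    HAVE_OPENCENSUS = True
except ImportError:  # keep the lib usable without the tracing extra
    HAVE_OPENCENSUS = False


def make_tracer(connection_string):
    """Return a tracer exporting spans to App Insights, or None if unavailable."""
    if not HAVE_OPENCENSUS:
        return None
    return Tracer(
        exporter=AzureExporter(connection_string=connection_string),
        sampler=ProbabilitySampler(1.0),  # sample everything for now
    )


# Usage inside the scoring app (spans show up in the Application Map):
# with make_tracer(conn_str).span(name="score_request"):
#     model.predict(...)
```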

Integrating the functionality into a single lib would be ideal. We need to investigate which libs to integrate.

@liupeirong I actually think that we "just" need to add the tracing capability of the OpenCensus Python extension for Azure Monitor into the app_insights_logger of this branch of MLOpsPython to achieve tracing. It's not yet on the main branch of MLOpsPython, but I'll try to borrow it anyway.

https://github.com/microsoft/MLOpsPython/blob/mamokari/observability/utils/logger/app_insights_logger.py

Made some progress, but it's far from finished. It already looks quite nice in the Application Map. Relevant code changes (vs. the MLOpsPython approach) are here.

(screenshot: App Insights Application Map)

Glossary

| Term | Remark |
| --- | --- |
| custom(er) business logic | The scripts (mainly Python, but not only; example here) that need to be wrapped/executed from a Python script containing the AML SDK run glue, like here. You could think of them as the ML train/prep scripts, but some steps may not be directly related to ML, so we chose a broader term. |
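To illustrate the split the glossary describes (a hypothetical layout; file and function names are made up for this sketch): the business logic stays AML-free, while a thin wrapper supplies the AML SDK run glue. In the real wrapper, `run` would come from azureml-core's `Run.get_context()`.

```python
# business_logic.py -- plain Python, no AML dependency
def prepare_data(path: str) -> int:
    """Pretend to prep a dataset and return a row count."""
    return len(path)  # placeholder for real prep work


# aml_wrapper.py -- the "run glue" around the business logic;
# run is an azureml Run in AML, or any object with .log() locally
def main(run) -> None:
    rows = prepare_data("raw.csv")
    run.log("rows_prepared", rows)  # metric lands in the AML run
```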

Scenario or use case

Recent observations across the IT industry show that observability is key for organizations to stay competitive in their business and technology decisions. As a starting point, it is good practice to get as much data as possible into a monitoring system, both to identify performance bottlenecks early in the development and test phases and to track down the root cause of production failures faster.

The goal is to package a pip library that can emit custom logs, traces, and HTTP call dependencies, and that is easy to integrate into the AML pipeline build scripts, the AML wrapper scripts, and the business logic itself.
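A possible surface for that package (entirely illustrative; the real API is still to be designed, and the real implementation would forward to App Insights / AML exporters instead of an in-memory list):

```python
from contextlib import contextmanager


class Observability:
    """Hypothetical facade: metrics, text logs, and trace spans in one place."""

    def __init__(self):
        self.events = []  # stand-in for the real exporters

    def log_metric(self, name, value):
        self.events.append(("metric", name, value))

    def log(self, message):
        self.events.append(("trace", message))

    @contextmanager
    def span(self, name):
        # a real span would appear in the App Insights Application Map
        self.events.append(("span_start", name))
        try:
            yield
        finally:
            self.events.append(("span_end", name))


obs = Observability()
with obs.span("score_request"):
    obs.log_metric("accuracy", 0.92)
```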

The v-team developed a first version based on the MLOpsPython efforts and extended it so that the App Insights Application Map can be used for drill-down, error/exception tracking, and dependency tracking.

The idea for packaging comes from project teams that successfully shared a logging package on a private package feed.

Acceptance criteria

  • Observability lib moved into the common section
  • CI/CD to publish to PyPI
  • Documentation on usage of the observability package (basically everything one needs to know to update an example; see next point)
  • One sample which imports and uses the package for AML scripts and business logic, as the links above describe

Stretch Goal

  • Refactor all samples to use the package
  • Optimize/clean up Core Observability lib code base