liupeirong/MLOpsManufacturing

AML Observability Library

liupeirong opened this issue · 8 comments

Requirements:

  1. The model scoring application should be able to log metrics and trace events to App Insights.
  2. It would also be nice to have the same code log metrics to both App Insights and Azure ML during training.
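Requirement 2 boils down to a fan-out pattern: one `log_metric()` call feeds every configured sink, so the same training code can report to both Azure ML and App Insights. A minimal sketch (class and sink names are illustrative, not the actual lib; in-memory sinks stand in for the real AML run and App Insights clients):

```python
class Observability:
    """Fan one metric call out to every configured sink."""

    def __init__(self, sinks):
        self.sinks = sinks  # e.g. an AML run sink and an App Insights sink

    def log_metric(self, name, value):
        for sink in self.sinks:
            sink.log(name, value)


class InMemorySink:
    """Stand-in for an AML Run or App Insights client."""

    def __init__(self):
        self.metrics = {}

    def log(self, name, value):
        self.metrics[name] = value


aml, app_insights = InMemorySink(), InMemorySink()
obs = Observability([aml, app_insights])
obs.log_metric("accuracy", 0.92)  # lands in both sinks
```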

Options for requirement 1:

Options for requirement 2:

Is there a single lib that can be used for both requirements?

Couldn't we use the Observability lib for both?

It looks like, in the scoring case, if you just don't define a run context, it will log to App Insights only:
https://github.com/microsoft/MLOpsPython/blob/c11342302123bf7a812e560726a3a509b937d13b/utils/logger/observability.py#L24-L39

And we could modify it to provide more flags and other loggers too.
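A simplified sketch of that dispatch (not the actual MLOpsPython code; names are illustrative): when no AML run context exists, as in scoring, only the App Insights logger is used, and a flag-driven version could extend the returned list with other loggers.

```python
def build_loggers(run_context, app_insights_logger, aml_logger_factory):
    """Pick the logging sinks based on whether an AML run context exists."""
    if run_context is None:  # scoring: no AML run was created
        return [app_insights_logger]
    # training: log to App Insights and to the AML run
    return [app_insights_logger, aml_logger_factory(run_context)]
```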

Yes, the Observability lib can log metrics and text, but it can't do tracing, which is useful in the inferencing application.

So how about trying to integrate the OpenCensus Python extension for Azure Monitor into the Observability lib? Wouldn't that be the tool we want? I haven't dug into whether there are technical blockers to doing so.
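For reference, the OpenCensus Azure Monitor extension (`opencensus-ext-azure`) exports spans via `AzureExporter`. A sketch of how the lib could wire it in, with the import guarded so the Observability lib stays importable without the extra dependency (the connection string is a placeholder):

```python
try:
    from opencensus.ext.azure.trace_exporter import AzureExporter
    from opencensus.trace.samplers import ProbabilitySampler
    from opencensus.trace.tracer import Tracer
    HAVE_OPENCENSUS = True
except ImportError:  # keep the lib usable without the tracing extra
    HAVE_OPENCENSUS = False


def make_tracer(connection_string):
    """Return a tracer exporting spans to App Insights, or None if unavailable."""
    if not HAVE_OPENCENSUS:
        return None
    return Tracer(
        exporter=AzureExporter(connection_string=connection_string),
        sampler=ProbabilitySampler(1.0),  # sample everything for now
    )


# Usage inside the scoring app (spans show up in the Application Map):
# with make_tracer(conn_str).span(name="score_request"):
#     model.predict(...)
```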

Integrating the functionality into a single lib would be ideal. We need to investigate which libs to integrate.

@liupeirong I actually think that we "just" need to add the tracing capability of the OpenCensus Python extension for Azure Monitor into the app_insights_logger of this branch of MLOpsPython to achieve tracing. It's not yet on the main branch of MLOpsPython, but I'll try to borrow it anyway.

https://github.com/microsoft/MLOpsPython/blob/mamokari/observability/utils/logger/app_insights_logger.py

Made some progress, but it's far from finished. It already looks quite nice in the Application Map. Relevant code changes (vs. the MLOpsPython approach) are here.

(screenshot: App Insights Application Map)

Glossary

| Term | Remark |
| --- | --- |
| custom(er) business logic | The scripts (mainly Python, but not only; example here) that need to be wrapped/executed from a Python script containing the AML SDK run glue, like here. You could think of them as the ML train/prep scripts, but some steps may not be directly related to ML, so we chose a broader term. |
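To illustrate the split the glossary describes (a hypothetical layout; file and function names are made up for this sketch): the business logic stays AML-free, while a thin wrapper supplies the AML SDK run glue. In the real wrapper, `run` would come from azureml-core's `Run.get_context()`.

```python
# business_logic.py -- plain Python, no AML dependency
def prepare_data(path: str) -> int:
    """Pretend to prep a dataset and return a row count."""
    return len(path)  # placeholder for real prep work


# aml_wrapper.py -- the "run glue" around the business logic;
# run is an azureml Run in AML, or any object with .log() locally
def main(run) -> None:
    rows = prepare_data("raw.csv")
    run.log("rows_prepared", rows)  # metric lands in the AML run
```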

Scenario or use case

Recent observations across the IT industry show that observability is key for organizations to stay competitive in their business and technology decisions. As a starting point, it is good practice to get as much data as possible into a monitoring system, both to identify performance bottlenecks early in the development and test phases and to track down the root cause of production failures faster.

The goal is to package a pip library that can emit custom logs, traces, and HTTP call dependencies, and that is easy to integrate into the AML pipeline build scripts, the AML wrapper scripts, and the business logic itself.
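A possible surface for that package (entirely illustrative; the real API is still to be designed, and the real implementation would forward to App Insights / AML exporters instead of an in-memory list):

```python
from contextlib import contextmanager


class Observability:
    """Hypothetical facade: metrics, text logs, and trace spans in one place."""

    def __init__(self):
        self.events = []  # stand-in for the real exporters

    def log_metric(self, name, value):
        self.events.append(("metric", name, value))

    def log(self, message):
        self.events.append(("trace", message))

    @contextmanager
    def span(self, name):
        # a real span would appear in the App Insights Application Map
        self.events.append(("span_start", name))
        try:
            yield
        finally:
            self.events.append(("span_end", name))


obs = Observability()
with obs.span("score_request"):
    obs.log_metric("accuracy", 0.92)
```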

The v-team developed a first version based on the MLOpsPython efforts and extended it so that the App Insights Application Map can be used for drill-down, error/exception tracking, and dependency tracking.

The idea for packaging comes from project teams that successfully shared a logging package on a private package feed.

Acceptance criteria

  • Observability lib moved into the common section
  • CI/CD to publish to PyPI
  • Documentation on usage of the observability package (basically everything one needs to know to update an example; see next point)
  • One sample which imports and uses the package for AML scripts and business logic, as the links above describe

Stretch Goal

  • Refactor all samples to use the package
  • Optimize/clean up Core Observability lib code base