liupeirong/MLOpsManufacturing

Provide guidance on managing AML environments

Closed · 1 comment

Take a look at the dependency management section of the image-classification-tensorflow sample. Provide additional best practices and samples.

Glossary

| Term | Remark |
| --- | --- |
| AML environments | What are Azure Machine Learning environments? |

Scenario or use case

Every new MLOps project brings its own set of library dependencies and runtime requirements (e.g. CPU, GPU, FPGA, MPI) that must be in place before AML pipeline steps can run. AML has the notion of environments: a description of (mostly) conda/pip packages plus preinstalled driver requirements such as GPU or MPI support. Internally, AML builds a Docker image with all requirements in place and then mounts the project scripts into it for execution. The image is associated with the AML environment so that it can be reused without being rebuilt every time. For more complex scenarios, a custom Docker image can be created and referenced as well.
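As a rough illustration of what such an environment description can look like, the sketch below renders a conda specification from lists of conda and pip packages using only the Python standard library. The function name `render_conda_spec`, the environment name and the package lists are hypothetical and not taken from the sample. The assumption is that the resulting file would then be handed to the Azure ML SDK v1 (`azureml-core`), for example via `Environment.from_conda_specification(name=..., file_path=...)`, which is not called here.

```python
from typing import List, Optional


def render_conda_spec(
    name: str,
    python_version: str,
    conda_packages: List[str],
    pip_packages: List[str],
    channels: Optional[List[str]] = None,
) -> str:
    """Render a conda environment specification as YAML text."""
    channels = channels or ["conda-forge"]
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines.append(f"  - python={python_version}")
    # Sort packages so the file is stable across runs and diffs stay small.
    lines += [f"  - {package}" for package in sorted(conda_packages)]
    if pip_packages:
        # conda expects pip itself as a dependency before the nested pip list.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {package}" for package in sorted(pip_packages)]
    return "\n".join(lines) + "\n"


# Hypothetical environment; names and packages are illustrative only.
spec = render_conda_spec(
    name="training-env",
    python_version="3.8",
    conda_packages=["numpy"],
    pip_packages=["tensorflow", "azureml-defaults"],
)
print(spec)
```

Keeping the specification in a file under version control, rather than assembling it ad hoc inside the pipeline build script, is one way to make changes to the environment reviewable.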

One finding from the project teams in the field was that there is very limited documentation and guidance on how to efficiently create and specify these AML environments within the AML pipeline build scripts. In addition, a best-practice guide on how to perform a dependency analysis (e.g. to identify the required pip packages and conda environments) would help avoid trial-and-error approaches to creating the needed AML environments.
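One possible starting point for such a dependency analysis is to scan the pipeline step scripts for their imports instead of discovering missing packages one failed run at a time. The sketch below is a minimal example under stated assumptions: it uses the standard-library `ast` module, the helper names `imported_top_level_modules` and `third_party_modules` are hypothetical, and filtering out the standard library relies on `sys.stdlib_module_names`, which is only available from Python 3.10 onward.

```python
import ast
import sys
from typing import Iterable, Set


def imported_top_level_modules(source: str) -> Set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import, i.e. project-local code.
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return found


def third_party_modules(sources: Iterable[str]) -> Set[str]:
    """Collect imported modules that are not part of the standard library."""
    # On Python < 3.10 this falls back to an empty set, so stdlib
    # modules are not filtered out there.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    modules: Set[str] = set()
    for source in sources:
        modules |= imported_top_level_modules(source)
    return {module for module in modules if module not in stdlib}


# Illustrative step script; in practice the sources would be read from files.
example_script = """
import os
import numpy as np
from tensorflow.keras import layers
from . import utils
"""
found = third_party_modules([example_script])
print(sorted(found))
```

The result is a list of import names, not of installable packages: some distributions are published under a different name than the module they provide (for example, `cv2` is installed as `opencv-python` and `sklearn` as `scikit-learn`), and absolute imports of project-local modules would also show up. The output is therefore a candidate list to review, not a finished environment definition.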

This issue is about documenting an approach that would have allowed the project teams to spend less time creating AML environments.

Creating AML environments is part of wrapping the business logic, so this is related to #38 but focuses more on the AML environments.

Acceptance criteria

Stretch Goal