Provide guidance on managing AML environments
Take a look at the dependency management section of the image-classification-tensorflow
sample, and provide additional best practices and samples.
Glossary
Term | Remark |
---|---|
AML environments | What are Azure Machine Learning environments? |
Scenario or use case
Every new MLOps project brings its own set of library dependencies and runtime requirements (e.g. CPU, GPU, FPGA, MPI) that must be in place before AML pipeline steps can be executed. AMLS has the notion of environments: a description of (mostly) conda/pip packages plus preinstalled driver needs such as GPU or MPI. Internally, AMLS builds a Docker image with all requirements in place before mounting the project scripts to be executed in it. This image is associated with the AML environment so that it can be reused without being rebuilt every time. For more complex scenarios, a custom Docker image can be created and referenced as well.
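To make the "description of conda/pip packages" concrete, here is a minimal sketch that renders a conda environment specification as YAML text using only the standard library. The helper `render_conda_spec`, the environment name, and the package pins are all hypothetical and chosen for illustration; they are not taken from any of the samples mentioned in this issue.

```python
from typing import List


def render_conda_spec(name: str, python_version: str,
                      conda_packages: List[str], pip_packages: List[str]) -> str:
    """Render a conda environment specification as YAML text."""
    lines = [f"name: {name}", "dependencies:", f"  - python={python_version}"]
    # Conda packages are listed directly under "dependencies".
    lines += [f"  - {pkg}" for pkg in sorted(conda_packages)]
    if pip_packages:
        # pip itself must be a conda dependency before the nested pip section.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"    - {pkg}" for pkg in sorted(pip_packages)]
    return "\n".join(lines) + "\n"


spec = render_conda_spec(
    name="training-env",                      # hypothetical environment name
    python_version="3.8",                     # illustrative pin
    conda_packages=["numpy=1.21"],            # illustrative pin
    pip_packages=["tensorflow==2.8.0", "azureml-defaults"],
)
print(spec)
```

Assuming the v1 Python SDK (`azureml-core`), a file with this content could then be handed to `Environment.from_conda_specification(name=..., file_path=...)`. Keeping the specification in one pinned file is one way to let the cached image be reused across pipeline runs.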
One finding from project teams in the field is that there is very limited documentation and guidance on how to efficiently create and specify these AML environments within the AML pipeline build scripts. In addition, it would be great to have a best-practice guide on how to do a dependency analysis (e.g. identifying the required pip packages and conda environments) to avoid a trial-and-error approach to creating the needed AML environments.
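As one possible starting point for such a dependency analysis, the sketch below statically scans Python source for imported top-level modules with the standard `ast` module. It is only an illustration under stated assumptions: the function names are hypothetical, it finds import names rather than pip package names, and it cannot see dynamic imports or non-Python requirements such as GPU drivers.

```python
import ast
import sys
from typing import Set


def top_level_imports(source: str) -> Set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports, which are project-local code.
            modules.add(node.module.split(".")[0])
    return modules


def third_party_candidates(source: str) -> Set[str]:
    """Drop standard-library modules where the interpreter can list them."""
    # sys.stdlib_module_names exists on Python 3.10+; older versions filter nothing.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return {m for m in top_level_imports(source) if m not in stdlib}


sample = """
import os
import numpy as np
from sklearn.model_selection import train_test_split
from . import local_helper
"""
print(sorted(top_level_imports(sample)))  # ['numpy', 'os', 'sklearn']
```

The result is a candidate list, not a finished environment: import names still have to be mapped to package names by hand (for example, `sklearn` is installed as `scikit-learn`) and pinned to versions.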
This issue is about documenting an approach that would have allowed the project teams to spend less time creating the AML environments.
Creation of AML environments is part of wrapping the business logic, so this is related to #38 but focusing more on the AML environments.
Acceptance criteria
- Documentation that explains AML environments in more detail, or more understandably, than the official docs
- Documentation that explains how to efficiently specify AML environments with the AML Python SDK and/or how to use the manage_environment.py utility inspired by MLOpsPython
- Documentation/Best Practice on how to do a dependency analysis of the business logic wrapped into AML
Stretch Goal
- Optimize and add features to the manage_environment.py utility (just an idea)
- Provide the manage_environment.py utility as a package (just an idea)