This repository contains the Docker containers used by the Azure Machine Learning Python SDK.
- Introduction
- How Azure ML prepares an image for training
- Dependencies
- How to get Azure ML Docker Containers
- Featured Tags
- How to run an Azure ML experiment
These Docker images are used for training runs submitted via Azure ML. When you submit a training run on AmlCompute or any other target with Docker enabled, Azure ML runs your job in a conda environment within a Docker container that has several dependencies installed. You can specify extra dependencies using the `pip_packages`, `pip_requirements_file_path`, `conda_dependencies_file` and `conda_packages` parameters. The extra dependencies are installed on top of the dependencies in the Docker image. If you are using one of the estimators specific to DNN training, the dependencies for that DNN framework are also installed. You can control the version of the DNN framework using the `framework_version` parameter. If no version is specified, the default version is used.
For example, if you are using the PyTorch Estimator, these are the steps that happen in the background:
- If no `custom_docker_image` is specified, Azure ML chooses between the CPU and GPU base images based on the `use_gpu` flag.
- Based on the `framework_version`, Azure ML selects the list of framework-specific dependencies to add. If no version is specified, the default framework version is used.
  - For the PyTorch estimator, the framework-specific dependencies are torch==1.0, torchvision==0.2.1 and horovod==0.15.2.
- The training happens on a Docker image built with all these dependencies. A new Docker image is built the first time a given combination of dependencies is used in a workspace. Otherwise, a cached Docker image is used. The Docker images built for training jobs are stored in an Azure Container Registry (ACR) that is attached to your workspace. You can get the name of this ACR using `workspace.get_details()`. If there was an image build step, you can look at the logs to understand the steps involved in building the final Docker image.
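The selection and caching steps above can be pictured with a small sketch. This is not Azure ML's implementation: the function names, the cache-key scheme (a hash over the base image and the sorted dependency list), and the choice of default tags are assumptions made for illustration. Only the PyTorch dependency pins and the CPU/GPU split come from the text.

```python
import hashlib

# Framework pins quoted in the text for the PyTorch estimator.
PYTORCH_DEPENDENCIES = ["torch==1.0", "torchvision==0.2.1", "horovod==0.15.2"]

# Assumed defaults for illustration: the IntelMPI CPU and GPU images.
DEFAULT_CPU_IMAGE = "mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04"
DEFAULT_GPU_IMAGE = (
    "mcr.microsoft.com/azureml/base-gpu:"
    "intelmpi2018.3-cuda10.0-cudnn7-ubuntu16.04"
)


def select_base_image(use_gpu, custom_docker_image=None):
    """Return the custom image if one is given, else a CPU or GPU default."""
    if custom_docker_image:
        return custom_docker_image
    return DEFAULT_GPU_IMAGE if use_gpu else DEFAULT_CPU_IMAGE


def image_cache_key(base_image, dependencies):
    """Hypothetical cache key: same base image and same set of deps give the same key."""
    canonical = base_image + "\n" + "\n".join(sorted(set(dependencies)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resolve_image(use_gpu, extra_packages, built_images, custom_docker_image=None):
    """Return (key, needs_build). `built_images` stands in for the workspace ACR."""
    base = select_base_image(use_gpu, custom_docker_image)
    key = image_cache_key(base, PYTORCH_DEPENDENCIES + list(extra_packages))
    needs_build = key not in built_images
    built_images.add(key)
    return key, needs_build
```

Calling `resolve_image` twice with the same packages in a different order reports a build only the first time, which mirrors the "build once, then reuse the cached image" behaviour described above.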
Currently Azure ML supports both cuda9 and cuda10 base images. The major dependencies installed in the base images are:
Dependencies | IntelMPI CPU | OpenMPI CPU | IntelMPI GPU | OpenMPI GPU |
---|---|---|---|---|
miniconda | ==4.5.11 | ==4.5.11 | ==4.5.11 | ==4.5.11 |
mpi | intelmpi==2018.3.222 | openmpi==3.1.2 | intelmpi==2018.3.222 | openmpi==3.1.2 |
cuda | - | - | 9.0/10.0 | 9.0/10.0/10.1 |
cudnn | - | - | 7.4/7.5 | 7.4/7.5 |
nccl | - | - | 2.4 | 2.4 |
git | 2.7.4 | 2.7.4 | 2.7.4 | 2.7.4 |
The CPU images are built from ubuntu16.04. The GPU images for cuda9 are built from nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04. The GPU images for cuda10 are built from nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04.
All images in this repository are published to the Microsoft Container Registry (MCR). Information about these images is also published to Docker Hub.
You can pull these images from MCR using the following commands.
- CPU image example:
  `docker pull mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04`
- GPU image example:
  `docker pull mcr.microsoft.com/azureml/base-gpu:intelmpi2018.3-ubuntu16.04`
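If you script image pulls, it can help to assemble the image reference from its parts. The sketch below is illustrative only: the helper names are hypothetical, and it simply joins the registry host, the `azureml` namespace, the repository and the tag using forward slashes and a colon, which is the form Docker expects for image references.

```python
MCR_HOST = "mcr.microsoft.com"


def mcr_image_reference(repository, tag):
    """Build '<host>/azureml/<repository>:<tag>' for an Azure ML base image."""
    return f"{MCR_HOST}/azureml/{repository}:{tag}"


def docker_pull_command(repository, tag):
    """Return the pull command as an argument list, e.g. for subprocess.run."""
    return ["docker", "pull", mcr_image_reference(repository, tag)]
```

For example, `docker_pull_command("base", "intelmpi2018.3-ubuntu16.04")` yields the argument list for the CPU pull command shown above.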
If you observe the naming convention, image name and image tag information can be identified from the folder names in this repository.
GPU images pulled from MCR can only be used with Azure services. See the LICENSE.txt file inside the Docker container for more information. GPU images are built from NVIDIA images. For NVIDIA CUDA and cuDNN, see the ThirdPartyNotices.txt file inside the Docker container for more information about NVIDIA's license terms.
Azure ML currently uses IntelMPI CPU as the default base image for training on a CPU compute target and IntelMPI GPU as the default base image for training on a GPU compute target. To override the default with another image from MCR or any publicly available image, specify it in the `custom_docker_image` parameter of the Azure ML Estimators. The `custom_docker_image` parameter is available in both the generic Estimator and the DNN Estimators provided by Azure ML.
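A minimal sketch of overriding the default image follows. It assumes the v1 SDK class `azureml.train.estimator.Estimator` accepts `source_directory`, `entry_script`, `compute_target`, `custom_docker_image`, `use_gpu` and `pip_packages` as keyword arguments; check the SDK reference for your installed version. The helper names and the example paths are hypothetical.

```python
def custom_image_estimator_kwargs(source_directory, entry_script, image,
                                  use_gpu=False, pip_packages=None):
    """Collect the keyword arguments for an estimator that runs on `image`."""
    return {
        "source_directory": source_directory,
        "entry_script": entry_script,
        "custom_docker_image": image,  # overrides the default base image
        "use_gpu": use_gpu,
        "pip_packages": list(pip_packages or []),  # installed on top of the image
    }


def build_estimator(compute_target, **kwargs):
    """Create the estimator. Requires the Azure ML SDK (assumed v1 API)."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from azureml.train.estimator import Estimator
    return Estimator(compute_target=compute_target, **kwargs)


kwargs = custom_image_estimator_kwargs(
    source_directory="./src",      # hypothetical project folder
    entry_script="train.py",       # hypothetical training script
    image="mcr.microsoft.com/azureml/base:openmpi3.1.2-ubuntu16.04",
    pip_packages=["scikit-learn"],
)
```

With the SDK installed and a compute target available, `build_estimator(compute_target, **kwargs)` would pass the same arguments to the estimator.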
Each image is associated with one or more tags. Below is the list of images and associated tags.
- IntelMPI CPU - Ubuntu 16.04
  - base:intelmpi2018.3-ubuntu16.04
  - base:latest
- OpenMPI CPU - Ubuntu 16.04
  - base:openmpi3.1.2-ubuntu16.04
- OpenMPI CPU - Ubuntu 18.04
  - base:openmpi3.1.2-ubuntu18.04
- IntelMPI GPU - cuda9 - Ubuntu 16.04
  - base-gpu:intelmpi2018.3-cuda9.0-cudnn7-ubuntu16.04
- OpenMPI GPU - cuda9 - Ubuntu 16.04
  - base-gpu:openmpi3.1.2-cuda9.0-cudnn7-ubuntu16.04
- IntelMPI GPU - cuda10.0 - Ubuntu 16.04
  - base-gpu:intelmpi2018.3-cuda10.0-cudnn7-ubuntu16.04
  - base-gpu:latest
- OpenMPI GPU - cuda10.0 - Ubuntu 16.04
  - base-gpu:openmpi3.1.2-cuda10.0-cudnn7-ubuntu16.04
- OpenMPI GPU - cuda10.0 - Ubuntu 18.04
  - base-gpu:openmpi3.1.2-cuda10.0-cudnn7-ubuntu18.04
- OpenMPI GPU - cuda10.1 - Ubuntu 18.04
  - base-gpu:openmpi3.1.2-cuda10.1-cudnn7-ubuntu18.04
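The versioned tags above follow a regular pattern of hyphen-separated `name` plus `version` components. The following sketch splits such a tag into its parts. It is an illustration based only on the tags listed here, the function name is hypothetical, and alias tags such as `latest` carry no version information, so the function rejects them.

```python
import re


def parse_tag(tag):
    """Split a tag like 'openmpi3.1.2-cuda10.1-cudnn7-ubuntu18.04' into components."""
    parts = {}
    for piece in tag.split("-"):
        # Each piece is a lowercase name followed by a dotted version number.
        match = re.fullmatch(r"([a-z]+)([0-9][0-9.]*)", piece)
        if match is None:
            raise ValueError(f"unrecognized tag component: {piece!r}")
        name, version = match.groups()
        parts[name] = version
    return parts
```

A CPU tag yields only the MPI and OS entries, while a GPU tag additionally yields `cuda` and `cudnn` entries.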
You can use your own custom Docker container to submit a training job in Azure ML. Below are the steps to build your own container and use it in Azure ML training.
- Install the Azure ML SDK and set up your environment
- Quickstarts, end-to-end tutorials, and how-tos on the official documentation site for Azure Machine Learning service.
- Python SDK reference
Use Docker images in this repository to understand how Docker images are created for use in Azure ML.
docker build -f dockerfile -t image_name:tag path
Documentation for docker build
Use `docker run` to test the image locally.
You can use images from Docker Hub and Azure Container Registry in Azure ML. Push your image to the repository of your choice.
If one of the training jobs you submitted in Azure ML needed it, an Azure Container Registry is created on your behalf and attached to your Azure ML workspace. You can get this information using `workspace.get_details()`. If the ACR was created, you can also get its credentials from the Azure portal and reuse the same ACR to push your images.
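As a sketch of reading the registry name from the workspace details: the code below assumes that the dictionary returned by `workspace.get_details()` stores the attached registry's Azure resource ID under a `containerRegistry` key. That key name and the sample resource ID are assumptions for illustration, so inspect the dictionary your SDK version returns before relying on them. The function name is hypothetical.

```python
def acr_name_from_details(details):
    """Return the registry name from workspace details, or None if no ACR is attached."""
    resource_id = details.get("containerRegistry")  # assumed key name
    if not resource_id:
        return None
    # Azure resource IDs end in ".../registries/<name>"; keep the last segment.
    return resource_id.rstrip("/").split("/")[-1]


# Made-up sample shaped like an Azure resource ID.
sample_details = {
    "containerRegistry": (
        "/subscriptions/0000/resourceGroups/my-rg/providers/"
        "Microsoft.ContainerRegistry/registries/myworkspaceacr"
    )
}
acr_name = acr_name_from_details(sample_details)
```

With a real workspace object you would pass `workspace.get_details()` instead of the sample dictionary.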
Use the `image_registry_details` and `custom_docker_image` parameters in Azure ML Estimators to train with the image built from your own Dockerfile.
If your image is in a public repository, here is an example of how to use the image for training. If your image is in a private repository, use the `image_registry_details` parameter to specify the credentials for your repository.
Visit the following repositories to see projects contributed by Azure ML users:
- Fine tune natural language processing models using Azure Machine Learning service
- Fashion MNIST with Azure ML SDK
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.