This repository provides sample code for learning how to use AutoML image classification or object detection in an Azure ML (AML) environment.
Target users
- You want to classify your photos, or find objects in them, with your own customized deep-learning models.
  - Please think about using Custom Vision first for simpler development.
- You don't want to customize the image-analysis algorithms very much.
  - This repository aims at the second-best strategy for the sake of simplicity[1], and mainly uses the automated machine learning technology provided by Microsoft.
- You want to obtain inference results from the deep-learning models in batch.
  - Please see the references if you're interested in real-time inference.
Disclaimer
- This repository aims at minimal system development based on some references. Most of the content is quoted from them, so please check them if you're interested in more detail.
- This repository was confirmed to work with some sample images as of July 2022. Please regard it as a guideline when developing your application.
- Azure subscription, and its AML workspace
- Image files to be classified
  - You can find one image file for testing the inference pipeline.
This repository is divided into a training pipeline and an inference pipeline. From the AML perspective both environments are the same, i.e. both pipelines use the same `compute_target`, `environment`, etc. in AML.
So you can easily merge them by reorganizing the input/output implementation, if you prefer.
- Prepare an Azure subscription and an AML workspace. You may find the steps here.
- Decide whether image classification or object detection meets your needs.[2]
  - `image classification` is divided into two tasks: `multi-class` and `multi-label`.
    - `multi-class`: Exactly one class is selected for each image, and some class must always be selected. e.g. Morning, Noon, Evening, Night
    - `multi-label`: Multiple labels can be assigned to each image, and in some cases no label is selected. e.g. with the labels dogs, cats and whales, a picture that doesn't contain any of these animals gets no label.
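As a minimal sketch of the difference between the two tasks, the following plain-Python check treats an annotation as a list of label names. The helper names and the sample class lists are hypothetical; they are not part of this repository or of Azure ML.

```python
from typing import Sequence


def is_valid_multiclass(labels: Sequence[str], classes: Sequence[str]) -> bool:
    """Multi-class: exactly one label, and it must be a known class."""
    return len(labels) == 1 and labels[0] in classes


def is_valid_multilabel(labels: Sequence[str], classes: Sequence[str]) -> bool:
    """Multi-label: zero or more distinct labels, all of them known classes."""
    no_duplicates = len(set(labels)) == len(labels)
    return no_duplicates and all(label in classes for label in labels)


# Hypothetical class lists, following the examples in the text above.
time_of_day = ["Morning", "Noon", "Evening", "Night"]
animals = ["dog", "cat", "whale"]

# A multi-class image always carries exactly one class.
print(is_valid_multiclass(["Noon"], time_of_day))
# A multi-label image may carry several labels, or none at all.
print(is_valid_multilabel(["dog", "cat"], animals))
print(is_valid_multilabel([], animals))
```

An empty label list is therefore acceptable only for the multi-label task.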
- Start data labelling with your image files, following the instructions.
- Export the labelled dataset into a Dataset in AML. It will be used in training afterwards.
- Prepare `config.ini` under the `/common` directory, following the instructions.
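For illustration only, a `config.ini` of this kind could be read with Python's standard `configparser` as sketched below. The section name `[azure]`, the key names, and all values are assumptions made for this example; the actual layout is the one given in the repository's instructions.

```python
import configparser

# Hypothetical file content; section name, keys and values are placeholders.
SAMPLE_CONFIG = """
[azure]
subscription_id = 00000000-0000-0000-0000-000000000000
resource_group = my-resource-group
workspace_name = my-aml-workspace
"""

# Keys this sketch expects to find (an assumption, not the repository's list).
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_settings(text: str) -> dict:
    """Parse INI text and return the required keys of the [azure] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["azure"]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise KeyError(f"config.ini is missing: {missing}")
    return {key: section[key] for key in REQUIRED_KEYS}


settings = load_settings(SAMPLE_CONFIG)
print(settings["workspace_name"])
```

To read a file on disk instead of a string, `parser.read(path)` can be used in place of `parser.read_string(text)`.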
- Once you have completed the preparation in c.2, please create the pipelines for training the deep-learning model with AutoML image classification, using the supported model algorithms. You may find the steps here.
- This repository uses AML pipelines for batch execution such as deep-learning training or inference. To do so, you need `train.py` or `inference.py`, which will be embedded in the pipelines.
- As preparation, you need an AML workspace, and you use two kinds of authentication:
  - `az` cli[3] in 00. provisioning. Please check the site, if necessary.
    - You can use `az login` or `az login --use-device-code`, whichever you prefer.
  - Managed identity in 10. AML-pipeline_train and 20. AML_pipeline_inference
    - As in the usual authentication concept, you need three steps: `populate managed ID`, `give access right to the populated ID`, and `retrieve AML workspace with the ID`
    - `Populate managed ID`:
      - In the sample implementation, you set `identity_type` as an argument of the method `AmlCompute.provisioning_configuration`:

            from azureml.core.compute import AmlCompute

            compute_config = AmlCompute.provisioning_configuration(
                vm_size=vm_size,
                idle_seconds_before_scaledown=600,
                min_nodes=0,
                max_nodes=4,
                location=vm_location,
                identity_type=managed_id,  ## Require `SystemAssigned` for System assigned managed ID here
            )

        With this setting, you can use `managed identity` to retrieve the AML workspace when the batch pipelines actually run for deep-learning training. Please see this page. You can confirm the populated managed ID in the screenshot, where it is marked with a red rectangle.

        (Screenshot: populated managed ID, marked with a red rectangle)
    - `Give access rights to the populated ID`
      - After generating the identity, you need to assign it appropriate rights such as `READ` or `WRITE` (IAM) in Azure AD, for example in the `Enterprise Application` settings. This site can help your understanding.
    - `Retrieve AML workspace with the ID`
      - You can retrieve the AML workspace as follows in train.py and inference.py:

            from azureml.core import Workspace
            from azureml.core.authentication import MsiAuthentication

            ## Authentication with managed identity
            msi_auth = MsiAuthentication()

            ## Retrieve Azure ML workspace
            ws = Workspace(subscription_id=subscription_id,
                           resource_group=resource_group,
                           workspace_name=workspace_name,
                           auth=msi_auth)
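The snippet above needs `subscription_id`, `resource_group` and `workspace_name` inside the script. One possible way to pass them, sketched here under the assumption that the pipeline step hands them over as command-line arguments, is `argparse`. The flag names and the helper functions are hypothetical and may differ from the actual `train.py` and `inference.py`.

```python
import argparse


def parse_workspace_args(argv=None):
    """Parse the workspace identifiers (hypothetical flag names)."""
    parser = argparse.ArgumentParser(description="Workspace identifiers for a pipeline step")
    parser.add_argument("--subscription_id", required=True)
    parser.add_argument("--resource_group", required=True)
    parser.add_argument("--workspace_name", required=True)
    return parser.parse_args(argv)


def get_workspace(args):
    """Retrieve the AML workspace with managed identity, as in the snippet above."""
    # Imported lazily so that the argument parsing can be tried
    # on a machine without the Azure ML SDK installed.
    from azureml.core import Workspace
    from azureml.core.authentication import MsiAuthentication

    return Workspace(subscription_id=args.subscription_id,
                     resource_group=args.resource_group,
                     workspace_name=args.workspace_name,
                     auth=MsiAuthentication())


# Placeholder values; in a pipeline step these would come from the step's arguments.
args = parse_workspace_args([
    "--subscription_id", "00000000-0000-0000-0000-000000000000",
    "--resource_group", "my-resource-group",
    "--workspace_name", "my-aml-workspace",
])
print(args.workspace_name)
```

`get_workspace(args)` is only meaningful on a compute cluster whose managed identity has been granted access; it is deliberately not called in this sketch.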
  - GPU instance in 10. AML-pipeline_train, and 20. AML-pipeline-inferrence
    - When you train a deep-learning model on a GPU instance, you need a specific VM series such as `NC-6` instead of `NV-6`.[4]

          compute_config = AmlCompute.provisioning_configuration(
              vm_size=vm_size,  # Specify `NC-` series as computer cluster here
              idle_seconds_before_scaledown=600,
              min_nodes=0,
              max_nodes=4,
              location=vm_location,  # Make sure the location prepares the `vm_size`
              identity_type=managed_id,
          )
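A small guard can catch a wrong VM series before the cluster is provisioned. The sketch below assumes Azure size names of the form `Standard_NC6` and `Standard_NV6` (the text above abbreviates them as `NC-6` and `NV-6`). The helper names and the hard-coded candidate list are hypothetical.

```python
def is_nc_series(vm_size: str) -> bool:
    """Return True for NC-series size names such as 'Standard_NC6' (case-insensitive)."""
    return vm_size.upper().startswith("STANDARD_NC")


def pick_nc_size(candidates):
    """Return the first NC-series size among the candidates, or None if there is none."""
    return next((size for size in candidates if is_nc_series(size)), None)


# Hypothetical list of sizes offered in the chosen region.
available = ["Standard_DS3_v2", "Standard_NV6", "Standard_NC6"]

vm_size = pick_nc_size(available)
if vm_size is None:
    raise ValueError("No NC-series VM size is available; try another location")
print(vm_size)
```

The selected `vm_size` can then be passed to `AmlCompute.provisioning_configuration` as in the snippet above.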
- You need to prepare a Python environment for executing the whole pipelines. The major functions to be developed are as follows:
  - Ingest image files labelled with the AML labelling tool
  - Train a deep-learning model with those files on a GPU cluster, with automatic fine-tuning
  - Infer with given image files and the generated deep-learning models
- To achieve this in a unified environment with `automl` in AML, the following is a candidate Python environment setting[5]. You can change it by adding more Python libraries as you prefer.[6]

      from azureml.core.conda_dependencies import CondaDependencies

      aml_run_config.environment.python.conda_dependencies = CondaDependencies.create(
          python_version='3.7',
          conda_packages=['pandas',
                          'scikit-learn',
                          'numpy==1.20.1',
                          'pycocotools==2.0.2'],
          pip_packages=['azureml-sdk',
                        'azureml-automl-core',
                        'azureml-automl-dnn-vision==1.43.0'],
          pin_sdk_version=False)
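For reference, the same package set can be written out as a conda environment file, which is convenient for reproducing the environment locally. The sketch below builds the YAML text with plain Python; the helper name and the environment name `automl-image-env` are hypothetical, and the repository itself configures the environment through `CondaDependencies.create` as shown above.

```python
def conda_yaml(python_version, conda_packages, pip_packages, name="automl-image-env"):
    """Build the text of a conda environment file from package lists."""
    lines = [f"name: {name}", "dependencies:", f"  - python={python_version}"]
    lines += [f"  - {pkg}" for pkg in conda_packages]
    # pip itself is a conda dependency; pip packages go in a nested list.
    lines += ["  - pip", "  - pip:"]
    lines += [f"    - {pkg}" for pkg in pip_packages]
    return "\n".join(lines) + "\n"


# Same packages and versions as in the setting above.
env_text = conda_yaml(
    python_version="3.7",
    conda_packages=["pandas", "scikit-learn", "numpy==1.20.1", "pycocotools==2.0.2"],
    pip_packages=["azureml-sdk", "azureml-automl-core", "azureml-automl-dnn-vision==1.43.0"],
)
print(env_text)
```

Writing `env_text` to a file such as `environment.yml` would allow `conda env create -f environment.yml` to build a matching local environment.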
- Typical use cases for image classification with AutoML in Azure
  - These use cases train and infer in similar ways. In particular, inference is implemented in a real-time manner.
  - If you're interested in batch inference, please refer to this use case, which has no explicit method to "predict" with given image data. By contrast, this repository has an explicit way to predict.
- Introduction to AutoML for images
Footnotes
1. If you're interested in more customized algorithms, please visit https://arxiv.org/list/cs.CV/recent
2. This repository doesn't cover image segmentation.
3. Command line interface
4. Please check the situation here. Indeed, you can choose the `NC` series only in specific regions.
5. As of July 2022
6. `pandas` and `scikit-learn` are not used in this repository, but they are basic libraries for developing more functions. Please add more if you need.