The main objective of this project is to build a machine learning model and deploy it using Azure Container Services. We have been provided with a banking dataset. The main steps of the project are:
1) Authentication
2) Automated ML Experiment
3) Deploy the best model
4) Enable logging
5) Swagger Documentation
6) Consume model endpoints
7) Create and publish a pipeline
8) Documentation
An architectural diagram of the project and an introduction to each step follow.
- We first have to register the dataset from the local files.
- We have to create a compute cluster of type Standard_DS12_v2 for running the AutoML experiment.
- The maximum number of nodes is 5 and the minimum is 1.
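The two steps above can be sketched with the Azure ML Python SDK (v1). This is a minimal illustration, not the project's exact code; the cluster name `automl-cluster` and the node counts are assumptions to adjust for your own workspace.

```python
# Sketch: provision the compute cluster used for the AutoML run.
# Assumes a config.json for the workspace has been downloaded from the portal.
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()

# Node counts here are illustrative; min_nodes=0/1 lets the cluster scale down.
provisioning_config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_DS12_V2",
    min_nodes=1,
    max_nodes=5,
)
cluster = ComputeTarget.create(ws, "automl-cluster", provisioning_config)
cluster.wait_for_completion(show_output=True)
```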
- We have to run an AutoML experiment using the registered dataset.
- We have to select the compute target we created earlier.
- After the AutoML run completes, we need to pick the best model from the various models trained.
- Here the best model was a Voting Ensemble, which combines the predictions of several runs. The base model is XGBoost with MaxAbs scaling, achieving an accuracy of 91%.
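A minimal sketch of submitting such an AutoML run and retrieving the best model. The registered dataset name `bankmarketing`, the label column `y`, the experiment name, and the timeout are assumptions for illustration:

```python
# Sketch: submit a classification AutoML run against the registered dataset.
from azureml.core import Workspace, Dataset, Experiment
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
dataset = Dataset.get_by_name(ws, name="bankmarketing")  # assumed dataset name

automl_config = AutoMLConfig(
    task="classification",
    primary_metric="accuracy",
    training_data=dataset,
    label_column_name="y",            # assumed target column
    compute_target="automl-cluster",  # the cluster created earlier
    experiment_timeout_minutes=30,
    enable_early_stopping=True,
)

run = Experiment(ws, "automl-bank").submit(automl_config)
run.wait_for_completion(show_output=True)

# Returns the child run and fitted model with the best primary metric
# (a Voting Ensemble in this project).
best_run, fitted_model = run.get_output()
```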
- Once we have the best model, it is time to deploy it. We can use Azure Kubernetes Service or Azure Container Instances for the deployment.
- We need to choose an authentication method during deployment. Once the deployment succeeds, an endpoint is created with its status showing as Healthy in the workspace.
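A sketch of an Azure Container Instances deployment with key-based authentication enabled. The model name, service name, scoring script, and curated environment are all assumptions; in the studio UI the same choices are made through the deployment dialog.

```python
# Sketch: deploy the registered best model to ACI with authentication on.
from azureml.core import Workspace, Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = Model(ws, "bank-automl-model")  # assumed registered model name

# Assumed curated environment and entry script for scoring.
env = Environment.get(workspace=ws, name="AzureML-AutoML")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# auth_enabled=True turns on key-based authentication for the endpoint.
deploy_config = AciWebservice.deploy_configuration(
    cpu_cores=1, memory_gb=1, auth_enabled=True
)

service = Model.deploy(ws, "bank-automl-service", [model],
                       inference_config, deploy_config)
service.wait_for_deployment(show_output=True)
print(service.state)  # "Healthy" once the deployment succeeds
```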
- Once the model is deployed, we need to enable logging by setting appinsights = True in the logging script, with the name of the deployed service.
- Once logging is enabled, we can see request statistics in Application Insights, such as failed requests, timed-out requests, etc.
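Enabling Application Insights can be sketched as below; the service name is an assumption:

```python
# Sketch: turn on Application Insights for an already-deployed service.
from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()
service = Webservice(ws, name="bank-automl-service")  # assumed service name

# Equivalent of setting appinsights = True for the endpoint.
service.update(enable_app_insights=True)
print(service.get_logs())
```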
- We can consume this endpoint using the REST API or by running Azure ML Python SDK scripts.
- Swagger is one of the API testing platforms available.
- Once the model is deployed, we get a swagger.json file from the endpoint, which needs to be downloaded and placed in the folder containing the swagger files serve.py and swagger.sh.
- After that, we need to launch a local web server using the serve script and launch Swagger in a Docker container by running swagger.sh.
We can schedule the pipelines using the ScheduleRecurrence parameter, reducing manual effort.
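Scheduling a published pipeline can be sketched as follows; the schedule name, experiment name, daily frequency, and pipeline id placeholder are assumptions:

```python
# Sketch: run a published pipeline on a daily recurrence.
from azureml.core import Workspace
from azureml.pipeline.core import Schedule, ScheduleRecurrence

ws = Workspace.from_config()

recurrence = ScheduleRecurrence(frequency="Day", interval=1)
schedule = Schedule.create(
    ws,
    name="bank-pipeline-schedule",
    pipeline_id="<published-pipeline-id>",  # id of the published pipeline
    experiment_name="automl-bank",
    recurrence=recurrence,
)
```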
Check out our CONTRIBUTING GUIDELINES
See project in action HERE🖼️
- Collecting more data can help improve accuracy.
- We can try scoring batch data on a schedule and evaluate the model's performance.