This project is part of the Udacity Azure ML Nanodegree. In this project, we build and optimize an Azure ML pipeline using the Python SDK and a provided Scikit-learn model. This model is then compared to an Azure AutoML run.
This dataset contains data from a marketing study that seeks to determine, based on the characteristics of the people contacted, whether they will subscribe to a term deposit.
Source: UCI Bank Marketing
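In the project, this data is pulled into Azure ML as a tabular dataset, roughly as sketched below; the URL is a placeholder for the course-provided CSV, not the real location.

```python
from azureml.data.dataset_factory import TabularDatasetFactory

# Placeholder URL: substitute the course-provided bank marketing CSV location.
data_url = "https://<storage-account>/bankmarketing_train.csv"
ds = TabularDatasetFactory.from_delimited_files(path=data_url)
```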
The best performing model was found by comparing a classical scikit-learn logistic regression against the different options available in Azure ML. To compare the approaches, accuracy was used as the common metric, with the following results:
- For the scikit-learn model, the accuracy was 90.7%.
- For the hyperparameter tuning run, where the regularization strength was randomly sampled from a uniform distribution over (0.01, 0.99), the accuracy was 91.3% (for a regularization strength of 0.266), slightly better than the plain scikit-learn model.
- Finally, by letting Azure AutoML select the best model and run, the accuracy increased to 91.7%. In this case, a VotingEnsemble was selected by Azure as the best performing model.
The max_iter hyperparameter caps the number of solver iterations, so that training stops even when the model does not converge.
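As a minimal sketch of how these two hyperparameters enter the scikit-learn model (the synthetic data here is only a stand-in for the bank marketing dataset, and the values of C and max_iter are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned bank marketing data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=42)

# C is the inverse regularization strength; max_iter caps solver iterations
# so training stops even if the solver has not converged.
model = LogisticRegression(C=0.266, max_iter=100).fit(x_train, y_train)
print("accuracy:", model.score(x_test, y_test))
```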
The pipeline consists of the data, hyperparameter tuning, and a classification algorithm. The scikit-learn model was a logistic regression trained on a subset of columns from the original data; the columns considered relevant for the prediction were selected in the training script. The contact's job, marital status, education level, whether they had ever defaulted financially, and whether they owned a home or had a loan were considered, along with how many contacts were made for the subscription and when those contacts were made.
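A hedged sketch of this kind of feature preparation is shown below; the column names follow the UCI Bank Marketing dataset, but the exact encoding in the project's training script may differ:

```python
import pandas as pd

# Toy rows shaped like the UCI Bank Marketing data (illustrative values only).
df = pd.DataFrame({
    "job": ["admin.", "technician"],
    "marital": ["married", "single"],
    "education": ["university.degree", "high.school"],
    "default": ["no", "no"],
    "housing": ["yes", "no"],
    "loan": ["no", "yes"],
    "campaign": [2, 1],       # number of contacts during this campaign
    "month": ["may", "jun"],  # when the contacts were made
    "y": ["no", "yes"],       # target: subscribed to a term deposit
})

# Binary yes/no columns become 0/1; categorical columns are one-hot encoded.
for col in ["default", "housing", "loan", "y"]:
    df[col] = df[col].map({"yes": 1, "no": 0})
features = pd.get_dummies(df.drop(columns="y"),
                          columns=["job", "marital", "education", "month"])
target = df["y"]
```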
What are the benefits of the parameter sampler you chose? The regularization strength of the model was the parameter that was sampled. Randomly sampling it from a uniform range lets the tuning run try models that are more or less strongly regularized, making it possible to find the best one.
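In the Azure ML SDK (v1), this corresponds to a random parameter sampler over a uniform range, roughly as follows; "--C" as the argument name is an assumption about the training script:

```python
from azureml.train.hyperdrive import RandomParameterSampling, uniform

# Sample the inverse regularization strength uniformly over (0.01, 0.99).
# "--C" is assumed to match the training script's argument name.
ps = RandomParameterSampling({"--C": uniform(0.01, 0.99)})
```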
What are the benefits of the early stopping policy you chose? For the stopping policy, the Bandit policy was selected; it terminates a run early when its primary metric falls outside a slack allowance of the best run so far, so compute is not wasted on runs that are clearly underperforming. The policy was evaluated every 2 intervals, and 10% was selected as the slack for the comparison.
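A sketch of how that policy plugs into a HyperDrive run follows; the script name, compute target, metric name, and run counts are assumptions based on the description above:

```python
from azureml.core import ScriptRunConfig
from azureml.train.hyperdrive import (BanditPolicy, HyperDriveConfig,
                                      PrimaryMetricGoal)

# Stop runs whose primary metric falls more than 10% (slack_factor=0.1)
# below the best run so far, checking every 2 evaluation intervals.
policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Assumed names: train.py as the training script, "cpu-cluster" as compute.
src = ScriptRunConfig(source_directory=".", script="train.py",
                      compute_target="cpu-cluster")

hd_config = HyperDriveConfig(run_config=src,
                             hyperparameter_sampling=ps,  # sampler shown above
                             policy=policy,
                             primary_metric_name="Accuracy",
                             primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                             max_total_runs=20,
                             max_concurrent_runs=4)
```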
Different models were generated and evaluated automatically by AutoML. The final, best performing model was a VotingEnsemble, which combines several previously trained models and produces its prediction by a weighted vote among them. The model explanation shows that the duration of the call is the most important feature in determining whether the subscription is finally achieved; after that, the number of employees and the employment variation rate are the next most important factors. Notice how different these are from the features hand-selected in the first approach.
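For reference, an AutoML run of this kind is configured roughly as below; the timeout, number of cross-validation folds, and label column name are assumptions:

```python
from azureml.train.automl import AutoMLConfig

# ds is the tabular dataset loaded earlier; "y" as the label column
# follows the UCI dataset. Timeout and fold count are assumptions.
automl_config = AutoMLConfig(task="classification",
                             primary_metric="accuracy",
                             training_data=ds,
                             label_column_name="y",
                             n_cross_validations=5,
                             experiment_timeout_minutes=30)
```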
Compare the two models and their performance. What are the differences in accuracy? In architecture? If there was a difference, why do you think there was one? Running the hyperparameter tuning model achieved greater accuracy by automating the selection of the best hyperparameter, but it leaves to the analyst the job of selecting the features and the model they believe best solves the problem. And although each individual run is short, preparing the experiment is extensive and requires more work. AutoML takes more time to run, but that extra time is compensated by freeing the analyst from selecting the features and the model. It also allows a wider range of models to be checked, including ensembles, which usually provide better results (as they take the best of each model) and are much faster to prepare, since they only need to be configured before letting Azure ML run them.
What are some areas of improvement for future experiments? Why might these improvements help the model? From this point, the model can be improved by monitoring it and adding new data. Another option is to use the AutoML results to pinpoint the models that provide the best results, run them separately, and analyze whether the data can be improved in any way for each specific model.