Optimizing an ML Pipeline in Azure

Overview

This project is part of the Udacity Azure ML Nanodegree. In this project, we build and optimize an Azure ML pipeline using the Python SDK and a provided Scikit-learn model. This model is then compared to an Azure AutoML run.

Summary

In 1-2 sentences, explain the problem statement: e.g "This dataset contains data about... we seek to predict..."

In 1-2 sentences, explain the solution: e.g. "The best performing model was a ..."

Scikit-learn Pipeline

Explain the pipeline architecture, including data, hyperparameter tuning, and classification algorithm.

What are the benefits of the parameter sampler you chose?

What are the benefits of the early stopping policy you chose?

AutoML

In 1-2 sentences, describe the model and hyperparameters generated by AutoML.

Pipeline comparison

Compare the two models and their performance. What are the differences in accuracy? In architecture? If there was a difference, why do you think there was one?

Future work

What are some areas of improvement for future experiments? Why might these improvements help the model?

Proof of cluster clean up

If you did not delete your compute cluster in the code, please complete this section. Otherwise, delete this section. Image of cluster marked for deletion