This readme contains a tutorial for attendees of the University of Kentucky MSP Azure Machine Learning Workshop. To register, visit this link: https://bit.ly/AzureMLUKY
Azure for Students gives K-12 and higher education students across the globe access to hundreds of Azure resources, including those needed for Azure Machine Learning Studio.
Follow the link: ---------- to download the "train.csv" dataset, which we will use to train and test the model. Select "New", then select "Dataset", and follow the on-screen instructions to upload the downloaded file from your computer. Then, under the "Saved Datasets" experiment item, drag "train.csv" into the workspace.
- Create a new experiment
- Add "select columns in dataset" -> "Exclude: PassengerId, Name, Ticket, Cabin"
- Add "clean missing data" -> Cleaning mode: Remove entire row
- Add "Edit metadata" -> Pclass, Sex, Survived, Embarked -> Categorical: Make categorical
- Add "Edit metadata" -> Survived -> Fields: label
- Add "Split Data" -> Fraction of rows = 0.7 -> Stratified split: True, on "Survived"
- Add 2 "Train model" -> select column: Survived
- Add two-class neural network and two-class SVM, wire to train model
- Add 2 "Score model"
- Add "evaluate model"
- Run and visualize output -> show which model is more accurate
- Set up web service -> Predictive web service [Recommended]
- Run predictive experiment
- Deploy web service
- Next to "Request/Response", click "Test"
- Submit a random sample of data
- Look at the last part of the JSON that's returned: integer 0=died, 1=survived, decimal is probability
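
To make that last step concrete, here is a minimal Python sketch of pulling the predicted label and the probability out of a response. The JSON layout, the `output1` key, and the column names `Scored Labels` and `Scored Probabilities` are assumptions for illustration, as is the sample row; compare them with what your own web service actually returns.

```python
import json

# Hypothetical response body. The real layout and column names may differ,
# and the values in this row are made up.
SAMPLE_RESPONSE = """
{
  "Results": {
    "output1": {
      "value": {
        "ColumnNames": ["Pclass", "Sex", "Age", "Scored Labels", "Scored Probabilities"],
        "Values": [["3", "male", "22", "0", "0.127"]]
      }
    }
  }
}
"""


def read_prediction(response_text):
    """Return (label, probability) from the last two values of the first row."""
    table = json.loads(response_text)["Results"]["output1"]["value"]
    row = table["Values"][0]
    label = int(row[-2])          # integer: 0 = died, 1 = survived
    probability = float(row[-1])  # decimal: the probability reported by the model
    return label, probability


label, probability = read_prediction(SAMPLE_RESPONSE)
outcome = "survived" if label == 1 else "died"
print(f"Predicted: {outcome} (probability: {probability:.3f})")
```

The sketch assumes the label and probability are the final two columns, which matches "the last part of the JSON" in the step above; if your output puts them elsewhere, look them up by column name instead.
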
- Visit aka.ms/AzureForStudents
- Click Activate Now and sign in with your LinkBlue credentials
- Enter your email address and phone number when prompted - you may have to verify one of them
- Open the Azure portal (portal.azure.com)
- Click the menu on the left-hand side and go to "Resource groups"
- Add a new resource group; we'll call it "MSPAprilML"
- Click "Review + Create", then "Create"
- Now, in the portal search bar, search for "Machine Learning" and add a new workspace
- Give it a name, choose the resource group you just made, and choose "ENTERPRISE" for pricing. Note: You won't be billed; this draws from your Azure for Students credit
- Click "Review + Create", then "Create" (takes about 3 minutes)
- Go to https://bit.ly/TitanicDataML
- From Excel online, go to File -> Save As -> Download a Copy. Open the CSV and see what data we have
- Go back to the Azure portal (portal.azure.com), search for "Machine Learning", and open that panel
- Open your ML workspace resource
- Click "Launch Now" inside the New Machine Learning Studio dialog box
- You may have to sign back in
- Under "Manage", open "Compute"
- Training Clusters -> New -> VM Size: STANDARD_D11_V2 -> Min number: 1 -> Max number: 2 -> CREATE
- Inference Clusters -> New -> Central US -> DevTest -> Create
- Designer -> Prebuilt -> give it a name
- First, we need to upload our dataset. From the left sidebar, go to "Datasets"; the studio is a GUI that allows us to build ML interactively. Choose Create Dataset -> From local files. Give it a name, leave the type as Tabular, and leave the datastore option at its default, then browse to the downloaded file and upload it. For Column headers, choose "Use headers from the first file". Click Next all the way through
- Back to designer, open previous pipeline
- Select compute target -> choose what you created earlier
- Expand the datasets drop-down and drag in the titanic dataset
- Drag in "Select Columns in Dataset", wire the titanic output to the input of this module, then edit "select columns" to include ALL columns EXCEPT: PassengerId, Name, Ticket, Cabin
- Submit, select compute target, create new, predefined (give it a name), submit
- Submit, create new experiment, give it a name, submit. On the backend, Azure is dynamically assigning a cluster of virtual machines to run your experiment, so this may take a long time.
- Add "clean missing data" -> Select columns: edit columns (By Name: Add all) -> Cleaning mode: Remove entire row, submit
- Add "Edit metadata" -> Pclass, Sex, Survived, Embarked -> Categorical: Make categorical :: SUBMIT
- Add "Edit metadata" -> Survived -> Fields: label :: SUBMIT
- Add "split data" -> stratified split: true, on "Survived" -> 0.7 :: SUBMIT
- Add 2 "Train Model" blocks -> Edit label column: Survived
- Add "two-class decision forest" and "2-class SVM", wire to train model blocks
- Add 2 "Score Model" blocks and wire
- Add an "Evaluate Model" block and wire :: SUBMIT
- Click on "Evaluate Model" -> Output -> We can visualize and see evaluation results
- Click on the higher-scoring "Train Model" block -> create real-time inference pipeline
- Once in real-time inference pipeline, delete Evaluate Model (since we don't have two models to compare) :: SUBMIT
- Once run is finished, DEPLOY
- Choose compute target, click deploy
- Once done, "view live endpoint" -> Test
- Enter some data, see results!
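
For a feel of what the data-preparation blocks in the pipeline do, here is a rough plain-Python sketch of three of them: excluding columns, removing rows with missing values, and a stratified 0.7 split on "Survived". This is not how Azure implements these modules. The helper names (`make_sample_rows`, `select_columns`, `remove_incomplete_rows`, `stratified_split`) are hypothetical, and the rows are made up rather than taken from the real dataset.

```python
import random
from collections import defaultdict

EXCLUDED = {"PassengerId", "Name", "Ticket", "Cabin"}


def make_sample_rows():
    """Build a small made-up table with Titanic-style columns (not real data)."""
    rows = []
    for i in range(1, 23):
        rows.append({
            "PassengerId": str(i),
            "Survived": str(i % 2),
            "Pclass": str(1 + i % 3),
            "Name": f"Passenger {i}",
            "Sex": "female" if i % 2 else "male",
            "Age": "" if i == 5 else str(20 + i),   # one row with a missing Age
            "Ticket": f"T{i}",
            "Cabin": "",                            # always empty in this sample
            "Embarked": "" if i == 8 else "S",      # one row with a missing Embarked
        })
    return rows


def select_columns(rows, excluded):
    """Keep every column except the excluded ones."""
    return [{k: v for k, v in row.items() if k not in excluded} for row in rows]


def remove_incomplete_rows(rows):
    """Drop any row that has at least one empty value."""
    return [row for row in rows if all(value != "" for value in row.values())]


def stratified_split(rows, label, fraction, seed=0):
    """Split rows so each label value contributes about `fraction` to the first part."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[label]].append(row)
    first, second = [], []
    for members in groups.values():
        rng.shuffle(members)
        cut = round(len(members) * fraction)
        first.extend(members[:cut])
        second.extend(members[cut:])
    return first, second


rows = select_columns(make_sample_rows(), EXCLUDED)
rows = remove_incomplete_rows(rows)
train, test = stratified_split(rows, label="Survived", fraction=0.7)
print(f"{len(rows)} clean rows -> {len(train)} for training, {len(test)} for testing")
```

Excluding Cabin before cleaning matters in this sketch: the column is empty in every sample row, so removing incomplete rows first would discard the whole table. Stratifying on "Survived" keeps the share of survivors roughly equal in the training and testing parts.
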