aro-odh-demo

Azure Red Hat OpenShift (ARO) demo using Open Data Hub (ODH) and Azure data services


Overview

ARO demo using Open Data Hub and Azure data services: Azure Data Lake and Azure Blob Storage.

Prerequisites

  • An ARO cluster with Open Data Hub deployed (the steps below expect the JupyterHub route in the odh namespace)
  • The oc CLI, logged in to the cluster
  • An Azure subscription; the CLI sketches below also assume az is installed

Setup

Download the Iris dataset.
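
For example, assuming the classic UCI Machine Learning Repository copy (whose filename, iris.data, matches the upload step below):

# Download the Iris dataset (UCI copy, assumed source); saves as iris.data
curl -O https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data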

Create a storage account with Azure Data Lake.
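
A sketch of this step with the Azure CLI; <resource-group> and <storage-account-name> are placeholders, and enabling the hierarchical namespace is what makes the account Data Lake Storage Gen2:

# Create a StorageV2 account with hierarchical namespace (Data Lake Gen2)
az storage account create \
  --name <storage-account-name> \
  --resource-group <resource-group> \
  --kind StorageV2 \
  --enable-hierarchical-namespace true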

Create a service principal (see the sketch after this list):

  • Make sure to assign the Storage Blob Data Contributor role to the service principal
  • Create a new application secret for authenticating the service principal
  • Copy down the client-id, tenant-id, and client-secret values (you will need these later)
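
A single az command can create the principal, assign the role, and emit the secret in one step; a sketch, with a placeholder principal name and scope:

# Creates a service principal scoped to the storage account with the
# Storage Blob Data Contributor role. In the output, appId, tenant, and
# password map to client-id, tenant-id, and client-secret respectively.
az ad sp create-for-rbac \
  --name aro-odh-demo-sp \
  --role "Storage Blob Data Contributor" \
  --scopes /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>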

View your account access key and copy down the storage account's connection string (you will need this later).
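
If you prefer the CLI to the portal, a sketch for fetching the connection string:

# Print the storage account's connection string (embeds an access key)
az storage account show-connection-string \
  --name <storage-account-name> \
  --resource-group <resource-group> \
  --output tsv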

Download azcopy.
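
On Linux, for example (the aka.ms link redirects to the latest v10 tarball; the extracted directory name varies by release):

# Download and extract azcopy v10, then put the binary on the PATH
curl -L https://aka.ms/downloadazcopy-v10-linux -o azcopy.tar.gz
tar -xzf azcopy.tar.gz
sudo cp azcopy_linux_amd64_*/azcopy /usr/local/bin/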

Upload the Iris dataset to Azure Data Lake.

Replace <tenant-id> and <storage-account-name> below with your tenant ID and storage account name:

# Log in with your Azure AD tenant, create the container, and upload the dataset
azcopy login --tenant-id=<tenant-id>
azcopy make 'https://<storage-account-name>.dfs.core.windows.net/mycontainer'
azcopy copy iris.data 'https://<storage-account-name>.dfs.core.windows.net/mycontainer/sample/iris.data'

Configure anonymous access to storage container mycontainer.
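
A sketch with the Azure CLI, assuming blob-level public read access suffices for the demo and the storage account permits public access:

# Allow anonymous read access to blobs in mycontainer
az storage container set-permission \
  --name mycontainer \
  --public-access blob \
  --connection-string '<connection-string>'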

Launch JupyterHub.

# Print the JupyterHub route URL, then open it in a browser
echo $(oc get route jupyterhub -n odh --template='http://{{.spec.host}}')

Select the s2i-spark-minimal-notebook image and spawn the server. Leave the other settings as they are.

Upload the model_pipeline.ipynb notebook. Set the variables in the second cell where it says ### ENTER YOUR DETAILS ###, using the client-id, tenant-id, client-secret, and connection string you copied down earlier.

TODO

  • Mount a secret with the env variables for the client, tenant, and client secret values (see the sketch after this list)
  • Add a Kubeflow-on-Tekton pipeline
  • Add model validation and model update steps to the pipeline
  • Add a Spark connection to Azure Data Lake
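
A possible starting point for the first item, with hypothetical secret and key names, assuming the odh namespace:

# Hypothetical sketch: store the service principal credentials in a secret
# that could later be mounted into the notebook pod as env variables
oc create secret generic azure-sp-creds -n odh \
  --from-literal=AZURE_CLIENT_ID=<client-id> \
  --from-literal=AZURE_TENANT_ID=<tenant-id> \
  --from-literal=AZURE_CLIENT_SECRET=<client-secret>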