/demo-rosa-sagemaker

How to use ROSA with AWS SageMaker

Fingerprint Prediction on Red Hat OpenShift Container Platform

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

The Project

The goal of this demo is to show how flexibly these tools integrate: train a model with SageMaker, serve it on OpenShift with the NVIDIA Triton Inference Server, and interact with it through Gradio.

Why do this?

Overall, data science needs flexibility and IT needs standardization.

  • From a development perspective, SageMaker is a robust toolkit that data scientists already know for preparing data and training models.
  • From an operational perspective, Red Hat OpenShift is a best-in-class enterprise Kubernetes container platform. It provides a hybrid cloud solution from private data centers to multiple cloud vendors. It makes it easier to build, operate, and scale globally, and on demand, through a familiar management interface.
  • Partner and open source software easily extends Red Hat OpenShift for other AI/ML needs:
    • AWS Controllers for Kubernetes (ACK) Operators: IAM, EC2, S3, SageMaker
    • NVIDIA Triton Inference Server
    • the Gradio UI software library

The Scenario

The scenario is to train a model that predicts suspect attributes from an unknown fingerprint. For example, in the data center or in the field, such a model could help narrow down possible suspects given an unseen fingerprint. Because we only had public data, the predictions are basic, but we hope the possibilities inspire further work.

The demo

This demo covers several topics across the lifecycle for extending Red Hat OpenShift to perform common data science tasks from data ingestion to inference monitoring.

Training Notebook: [screenshot: SageMaker notebook]
Inference UI: [screenshot: Gradio fingerprint user interface]

See Getting Started to begin.

Built With

Getting Started

If the demo is already installed, start with the SageMaker notebooks in the AWS Console. If not, see the Prerequisites.

Prerequisites

  • Red Hat OpenShift cluster, version 4.10 or later
  • Cluster admin permissions
  • AWS Web Console access
  • `oc` CLI installed locally
  • Python 3.x installed locally
  • AWS CLI (`aws`) installed locally
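
A quick way to confirm that the local tools are present is a small shell check. This is a minimal sketch and not part of the repository: the function name `missing_tools` is hypothetical, and it only checks that the commands are on the `PATH`. It does not check versions, cluster permissions, or AWS access.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): print each named command
# that cannot be found on the PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage: report anything missing from the local prerequisites.
# missing_tools oc python3 aws
```

If the call prints nothing, all three commands were found.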

Installation

# From the browser:
# 1. Access the OpenShift web console and log in

# 2. Copy your terminal login command and token from the web console

# From your terminal:
# 3. Log in to the cluster using oc login (you may need to add --insecure-skip-tls-verify=true)
oc login

# 4. Clone the demo-rosa-sagemaker repo to provision the demo
git clone https://github.com/redhat-na-ssa/demo-rosa-sagemaker.git

# 5. Change into the demo-rosa-sagemaker directory
cd demo-rosa-sagemaker

# 6. Run scripts/bootstrap.sh
./scripts/bootstrap.sh

# Optional: run `source ./scripts/bootstrap.sh`, then call its functions individually, e.g.
setup_demo
delete_demo
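
Since the steps above log in before running the bootstrap script, it can help to confirm that the `oc` session is active first. The sketch below is not part of the repository: `require_login` is a hypothetical name, and it relies only on `oc whoami`, which exits non-zero when there is no active session.

```shell
#!/bin/sh
# Hypothetical guard (not part of this repo): succeed only when `oc`
# has an active session; otherwise print a hint and return non-zero.
require_login() {
  if user=$(oc whoami 2>/dev/null); then
    printf 'Logged in as %s\n' "$user"
  else
    printf 'Not logged in; run "oc login" first.\n' >&2
    return 1
  fi
}

# Usage (from the demo-rosa-sagemaker directory):
# require_login && ./scripts/bootstrap.sh
```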

Tutorial

# From the AWS Web Console
# search for `S3` in the search bar and navigate to the bucket that holds our data (`sagemaker-fingerprint-data-<UUID>`)

# search for `SageMaker` in the search bar and go to the page

# navigate down the menu to `Notebook > Notebook Instances`

# (optional) inspect the instance and lifecycle configuration

# launch the JupyterLab instance

# in JupyterLab, go to the `notebooks` directory, then open and run each notebook in order

# IMPORTANT: keep only one notebook open at a time. After you run each notebook, restart the kernel, clear the output, and close the notebook

# open, run and close the following notebooks in order:

  # 00_data_exploration.ipynb
  
  # 01_data_processing.ipynb
  
  # 02_model_train_hand.ipynb

# search for `S3` in the search bar and navigate to the bucket that holds our data (`sagemaker-fingerprint-data-<UUID>`)

# move into the folder with our model

# From the OpenShift Web Console

# navigate to `Operators > Installed Operators` and notice the AWS operators

# navigate to the Project `models`

# go to the `Networking > Routes` menu item

# open the URL for the `gradio-client` instance and interact with the model

# under the `Routes` menu, open the URL for the `grafana` instance

# to obtain the credentials, from the OCP Web Console, click on `Workloads > Secrets`

# open the `grafana-admin-credentials` instance, copy and paste the credentials into the grafana dashboard

# go to the four-block waffle menu and click `models`

# go to `Triton Model Server` and view the dashboard metrics

# From the AWS Web Console SageMaker Notebook
# open, run and close the following notebooks in order:

  # 03_model_tune_hand.ipynb

  # 04_model_batch_inference.ipynb

# Review the reports and the prediction.csv files saved to your scratch folder

# From the AWS Web Console
# search for `S3` in the search bar and navigate to the bucket that holds our data (`sagemaker-fingerprint-data-<UUID>`)

# move into the folder with our model and review the second version of our tuned model

# From the OpenShift Web Console
# click on `Workloads > Deployments`

# open the `gradio-client` deployment and click on the `Environment` menu item

# notice the MODEL_REPOSITORY value that will pull the latest version of the model available for inference

# open the URL for the `gradio-client` instance and interact with the model (notice the "model_version": "2")
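
Several of the console steps above have command-line equivalents. The following is a rough sketch under stated assumptions, not the repository's own tooling. The function names are hypothetical. The `models` project and the `gradio-client`, `grafana`, and `grafana-admin-credentials` names come from the steps above, and the sketch assumes the Grafana route and secret live in that same project. The `<UUID>` suffix of the bucket is specific to your installation, so the full bucket name must be passed in.

```shell
#!/bin/sh
# Hypothetical helpers (not part of this repo) mirroring the console steps.
NAMESPACE="${NAMESPACE:-models}"   # project name taken from the tutorial

# Print the host of a route, e.g. gradio-client or grafana.
route_host() {
  oc get route "$1" -n "$NAMESPACE" -o jsonpath='{.spec.host}'
}

# Print the decoded Grafana admin credentials
# (assumes the secret is in $NAMESPACE).
grafana_credentials() {
  oc extract secret/grafana-admin-credentials -n "$NAMESPACE" --to=-
}

# List the environment of the gradio-client deployment,
# which includes MODEL_REPOSITORY.
gradio_env() {
  oc set env deployment/gradio-client -n "$NAMESPACE" --list
}

# List the contents of the data bucket; pass the full bucket name as $1.
list_bucket() {
  aws s3 ls "s3://$1/" --recursive
}
```

For example, `route_host gradio-client` prints the host to open in a browser, and `list_bucket` called with your own `sagemaker-fingerprint-data-<UUID>` bucket name lists the objects stored there, including the model folder.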

Intended Usage

This demo is intended to run on Red Hat OpenShift Container Platform (RHOCP) on AWS (self-managed) or, alternatively, on Red Hat OpenShift Service on AWS (ROSA, managed). It extends RHOCP with AWS capabilities.

Contributing

License

Contact

Acknowledgements