/cloud-ml-examples

A collection of Machine Learning examples to get started with deploying RAPIDS in the Cloud

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

 RAPIDS Cloud Machine Learning Services Integration

RAPIDS is a suite of open-source libraries that bring GPU acceleration to data science pipelines. Users building cloud-based machine learning experiments can take advantage of this acceleration throughout their workloads to build models faster, cheaper, and more easily on the cloud platform of their choice.

This repository provides example notebooks and "getting started" code samples to help you integrate RAPIDS with the hyperparameter optimization (HPO) services from Azure ML, AWS SageMaker, Google Cloud, and Databricks. The directory for each cloud contains a step-by-step guide to launching an example hyperparameter optimization job. Each example job uses RAPIDS cuDF to load and preprocess data and cuML or XGBoost for GPU-accelerated model training. RAPIDS also integrates easily with MLflow to track and orchestrate experiments from any of these frameworks.
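
As a rough illustration of what a single HPO job does, the sketch below pairs a GPU trial function (cuDF for loading, XGBoost for training) with a small local random-search driver. It is not taken from the notebooks in this repository: the `target` label column, the Parquet input, the search ranges, and the function names are all assumptions, and the managed cloud services replace `random_search` with their own tuners.

```python
import random


def train_and_score(params, data_path, label="target"):
    """Run one trial: cuDF loads the data, XGBoost trains on the GPU.

    Imports are local so the search driver below can be exercised on a
    machine without RAPIDS installed. `label` is a hypothetical column name.
    """
    import cudf
    import xgboost as xgb
    from cuml.model_selection import train_test_split

    df = cudf.read_parquet(data_path).dropna()   # load into GPU memory, drop nulls
    X = df.drop(columns=[label]).astype("float32")
    y = df[label].astype("int32")
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )

    booster = xgb.train(
        # "gpu_hist" matches XGBoost releases of the RAPIDS 22.06 era;
        # XGBoost 2.x prefers tree_method="hist" with device="cuda".
        {"objective": "binary:logistic", "tree_method": "gpu_hist", **params},
        xgb.DMatrix(X_train, label=y_train),
        num_boost_round=100,
    )
    preds = booster.predict(xgb.DMatrix(X_test)) > 0.5   # probabilities -> labels
    return float((preds == y_test.to_numpy()).mean())     # validation accuracy


def random_search(objective, n_trials=10, seed=0):
    """Sample hyperparameters at random and keep the best-scoring set."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        params = {
            "max_depth": rng.randint(3, 12),
            "learning_rate": 10 ** rng.uniform(-3, 0),   # log-uniform in [0.001, 1]
            "subsample": rng.uniform(0.5, 1.0),
        }
        score = objective(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

On a GPU machine with RAPIDS installed, a call might look like `random_search(lambda p: train_and_score(p, "data.parquet"))`, where `data.parquet` is a placeholder path.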

For large datasets, you can find example notebooks using Dask to load data and train models on multiple GPUs in the same instance or in a multi-node multi-GPU cluster.
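
A minimal sketch of the single-instance, multi-GPU case, assuming `dask_cuda`, `dask_cudf`, and XGBoost's Dask interface are installed. The function name, the `target` label column, and the Parquet input are hypothetical, and the notebooks in this repository may set things up differently.

```python
def train_multi_gpu(data_path, label="target", num_boost_round=100):
    """Train XGBoost across every GPU visible on this machine."""
    # Local imports: the function can be defined without RAPIDS installed.
    import dask_cudf
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster
    from xgboost import dask as dxgb

    # LocalCUDACluster starts one Dask worker per local GPU.
    with LocalCUDACluster() as cluster, Client(cluster) as client:
        ddf = dask_cudf.read_parquet(data_path)   # lazy, partitioned GPU dataframe
        X = ddf.drop(columns=[label]).astype("float32")
        y = ddf[label].astype("float32")

        dtrain = dxgb.DaskDMatrix(client, X, y)
        result = dxgb.train(
            client,
            # "gpu_hist" matches XGBoost releases of the RAPIDS 22.06 era;
            # XGBoost 2.x prefers tree_method="hist" with device="cuda".
            {"objective": "binary:logistic", "tree_method": "gpu_hist"},
            dtrain,
            num_boost_round=num_boost_round,
        )
    return result["booster"]   # trained Booster; result["history"] holds eval logs
```

For a multi-node cluster, the same training code would typically connect `Client` to an existing scheduler address rather than creating a `LocalCUDACluster`.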

| Cloud / Framework | HPO Example | Multi-node multi-GPU Example |
| --- | --- | --- |
| Microsoft Azure | Azure ML HPO | Multi-node multi-GPU cuML on Azure |
| Amazon Web Services (AWS) | AWS SageMaker HPO | |
| Google Cloud Platform (GCP) | Google AI Platform HPO | Multi-node multi-GPU XGBoost and cuML on Google Kubernetes Engine (GKE) |
| Dask | Dask-ML HPO | Multi-node multi-GPU XGBoost and cuML |
| Databricks | Hyperopt and MLflow on Databricks | |
| MLflow | Hyperopt and MLflow on GKE | |
| Optuna | Dask-Optuna HPO; Optuna on Azure ML | |
| Ray Tune | Ray Tune HPO | |

Quick Start Using RAPIDS Cloud ML Container

The Cloud ML Docker Repository provides a ready-to-run Docker container with RAPIDS and the libraries/SDKs for the AWS SageMaker, Azure ML, and Google AI Platform HPO examples.

Pull Docker Image:

docker pull rapidsai/rapidsai-cloud-ml:22.06-cuda11.5-base-ubuntu18.04-py3.8

Build Docker Image:

From the root cloud-ml-examples directory:

docker build --tag rapidsai-cloud-ml:latest --file ./common/docker/Dockerfile.training.unified ./

Bring Your Own Cloud (Dask and Ray)

In addition to the public cloud HPO options, the repository also includes "BYOC" sample notebooks that can be run on the public cloud or private infrastructure of your choice. These notebooks use Ray Tune or Dask-ML for distributed infrastructure.

Check out the RAPIDS HPO webpage for video tutorials and blog posts.
