RAPIDS is a suite of open-source libraries that bring GPU acceleration to data science pipelines. Users building cloud-based hyperparameter optimization experiments can take advantage of this acceleration throughout their workloads to build models faster, cheaper, and more easily on the cloud platform of their choice.
This repository provides example notebooks and "getting started" code samples to help you integrate RAPIDS with the hyperparameter optimization services from Azure ML, AWS SageMaker, Google Cloud, and Databricks. The directory for each cloud contains a step-by-step guide to launching an example hyperparameter optimization job.
Each example job uses RAPIDS cuDF to load and preprocess 20 million rows of airline arrival and departure data, then builds a model to predict whether a flight will arrive on time. The examples demonstrate both cuML Random Forests and GPU-accelerated XGBoost modeling.
Amazon SageMaker Step-by-step.
From the root cloud-ml-examples directory:
docker build --tag cloud_examples_unified:latest --file ./common/docker/Dockerfile.training.unified ./
In addition to public cloud HPO options, the repository also includes "BYOC" sample notebooks that can be run on the public cloud or private infrastructure of your choice. These use Ray Tune or Dask-ML for distributed infrastructure and demonstrate the same airline classifier HPO workload.
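For readers new to HPO, the loop that every one of these services runs can be sketched in a few lines: sample a hyperparameter configuration, score it, and keep the best. The sketch below is illustrative only and is not taken from the repository's notebooks. The search space, the parameter ranges, and the `score_config` objective are all hypothetical; in a real job, `score_config` would train a cuML Random Forest or XGBoost model on the airline data and return its validation accuracy. A synthetic objective is used here so the example runs without a GPU.

```python
import random

# Hypothetical search space. The names mirror common random-forest settings
# but the ranges are illustrative, not the repository's actual values.
SEARCH_SPACE = {
    "max_depth": (5, 15),
    "n_estimators": (100, 500),
    "max_features": (0.1, 1.0),
}


def sample_config(rng):
    """Draw one hyperparameter configuration uniformly from SEARCH_SPACE."""
    return {
        "max_depth": rng.randint(*SEARCH_SPACE["max_depth"]),
        "n_estimators": rng.randint(*SEARCH_SPACE["n_estimators"]),
        "max_features": rng.uniform(*SEARCH_SPACE["max_features"]),
    }


def score_config(config):
    """Stand-in objective (an assumption): returns a synthetic score in (0, 1].

    A real HPO job would train a model with `config` and return a
    validation metric instead.
    """
    depth_term = (config["max_depth"] - 10) ** 2 / 100.0
    feature_term = (config["max_features"] - 0.5) ** 2
    return 1.0 / (1.0 + depth_term + feature_term)


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = score_config(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    config, score = random_search(n_trials=20)
    print(config, round(score, 4))
```

The cloud HPO services and libraries such as Ray Tune replace this sequential loop with parallel trials and often with smarter search strategies, but the contract is the same: a search space in, a best-scoring configuration out.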