
Implementations of Amazon SageMaker-compatible custom containers for training.


Amazon SageMaker Custom Training containers

This folder contains skeleton implementations of Amazon SageMaker-compatible training containers.

These examples explain how to build a custom container for training, with a particular focus on the Amazon SageMaker Training Toolkit. The toolkit facilitates the development of training containers for SageMaker and enables dynamic loading of user scripts from Amazon S3, which separates the execution environment (the Docker container) from the script being executed. For additional information, see: https://github.com/aws/sagemaker-training-toolkit.

By design, no specific ML science is applied here; the code only simulates the training of dummy models.

Each example is structured as follows:

example
├───docker     # Dockerfile and dependencies
├───notebook   # Notebook with detailed walkthrough
└───scripts    # Build scripts

Four examples are provided and listed below.

Basic Training Container

The bare minimum required to build a custom Docker container that runs training in Amazon SageMaker. See additional details here: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html.

Basic training container diagram
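As a rough illustration of what the training program inside such a container might do, the sketch below follows the directory layout that SageMaker mounts under `/opt/ml`: hyperparameters are read from `input/config/hyperparameters.json` and the resulting artifact is written to `model/`. The `train` function name, the `epochs` hyperparameter, and the `model.json` file are assumptions made for this example and are not taken from the repository.

```python
import json
from pathlib import Path


def train(base_dir="/opt/ml"):
    """Simulate training using the SageMaker container directory layout."""
    base = Path(base_dir)

    # SageMaker writes the job's hyperparameters here; every value is a string.
    hp_path = base / "input" / "config" / "hyperparameters.json"
    hyperparameters = json.loads(hp_path.read_text()) if hp_path.exists() else {}

    # "epochs" is a hypothetical hyperparameter used only for illustration.
    epochs = int(hyperparameters.get("epochs", "1"))

    # Dummy "training": no real ML, just a placeholder artifact.
    model = {"epochs_run": epochs, "weights": [0.0, 0.0, 0.0]}

    # Files saved under model/ are packaged and uploaded to S3 when the job ends.
    model_dir = base / "model"
    model_dir.mkdir(parents=True, exist_ok=True)
    model_path = model_dir / "model.json"
    model_path.write_text(json.dumps(model))
    return model_path
```

In a real container this function would be called from an executable named `train` on the `PATH`, since SageMaker starts the container with `train` as its argument. The `base_dir` parameter exists only so the sketch can be tried locally against a temporary directory.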

Script Mode Container

A custom container that installs the Amazon SageMaker Training Toolkit and enables Script Mode execution through it.

Script mode container diagram
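To make Script Mode more concrete, here is a minimal sketch of a user training script that such a container could execute. It assumes the toolkit passes hyperparameters as command-line flags and exposes the model output directory through the `SM_MODEL_DIR` environment variable. The `--epochs` flag and the `model.json` file name are hypothetical and chosen only for this example.

```python
import argparse
import json
import os
from pathlib import Path


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line flags; "epochs" is hypothetical.
    parser.add_argument("--epochs", type=int, default=1)
    # The toolkit exposes job paths through SM_* environment variables.
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    # Tolerate flags this script does not know about.
    args, _unknown = parser.parse_known_args(argv)
    return args


def main(argv=None):
    args = parse_args(argv)

    # Dummy "training": record how many epochs were requested.
    model_dir = Path(args.model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    model_path = model_dir / "model.json"
    model_path.write_text(json.dumps({"epochs_run": args.epochs}))
    return model_path
```

When used as a standalone script, `main()` would be called under an `if __name__ == "__main__":` guard, and the container would name the script through the `SAGEMAKER_PROGRAM` environment variable.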

Script Mode Container 2

Similar to the Script Mode Container example, but the user-provided training module is loaded from Amazon S3.

Script mode container 2 diagram
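One way to tell the toolkit where the user code lives is through two reserved hyperparameters, `sagemaker_program` (the entry-point script) and `sagemaker_submit_directory` (the S3 location of the packaged sources). The helper below is a sketch of how such a hyperparameter dictionary could be assembled. The function name, the bucket, and the `epochs` value are hypothetical, and JSON-encoding each value is an assumption that mirrors what the SageMaker Python SDK does for its framework estimators.

```python
import json


def script_mode_hyperparameters(entry_point, source_s3_uri, **user_hyperparameters):
    """Build a hyperparameter dict that points the toolkit at code stored in S3."""
    hyperparameters = {
        # Name of the script to run, relative to the root of the source archive.
        "sagemaker_program": entry_point,
        # S3 URI of the tar.gz archive that contains the user code.
        "sagemaker_submit_directory": source_s3_uri,
        **user_hyperparameters,
    }
    # Assumption: values are JSON-encoded strings, as the SageMaker Python SDK sends them.
    return {key: json.dumps(value) for key, value in hyperparameters.items()}


# Hypothetical bucket and archive name, for illustration only.
example = script_mode_hyperparameters(
    "train.py",
    "s3://example-bucket/code/sourcedir.tar.gz",
    epochs=3,
)
```

The resulting dictionary would then be passed as the hyperparameters of the training job, so the same container image can run different scripts without being rebuilt.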

Framework Container

Similar to the Script Mode Container 2 example, but with an additional module installed that makes it possible to customize an ML/DL framework before the user-provided training module is executed.

Framework container diagram
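The idea of a framework module can be sketched as a thin wrapper that applies framework-level settings and then hands control to the user script. The example below uses only the Python standard library and is not the Training Toolkit's own mechanism, which provides its own entry-point runner. The function name `run_user_module` and the `framework_env` parameter are hypothetical.

```python
import os
import subprocess
import sys


def run_user_module(script_path, args=(), framework_env=None):
    """Apply framework-level settings, then execute the user's training script."""
    env = dict(os.environ)

    # Framework customization step: in this sketch, just extra environment
    # variables (for example, a thread-count setting for an ML/DL framework).
    env.update(framework_env or {})

    # Run the user script in a child process; raise if it exits with an error.
    completed = subprocess.run(
        [sys.executable, script_path, *args],
        env=env,
        check=True,
    )
    return completed.returncode
```

Keeping the customization in a separate module means the user script stays unaware of it, which matches the separation between execution environment and training code that the earlier examples build up.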