This repository was created with reference to this notebook.
Its contents make it easy to build an ML workflow without using a notebook.
As an example, it runs MNIST preprocessing, training, and model evaluation in Step Functions.
- Create an S3 bucket for use with the workflow.
- Create an ECR repository to store the container image used by SageMaker Processing (a CLI sketch follows this list).
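If you prefer the CLI, a minimal sketch of these two steps (the bucket and repository names below are placeholders):

$ aws s3 mb s3://your-bucket-name
$ aws ecr create-repository --repository-name your-ecr-repo-name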
To execute it, you need two roles: a SageMaker execution role and a Step Functions workflow execution role.
- SageMaker execution role:
  Create a role that has access to the S3 bucket and ECR repository you created. Also attach the SageMaker and Step Functions policies to this role (a CLI sketch follows this list).
- Workflow execution role:
  Create a workflow execution role for Step Functions according to this explanation.
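As one possible sketch, the SageMaker execution role could be created from the CLI using AWS managed policies (the role name is a placeholder, and in practice you may want to scope the S3 and ECR access down to the bucket and repository you created):

$ aws iam create-role --role-name sagemaker-execution-role \
    --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"sagemaker.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
$ aws iam attach-role-policy --role-name sagemaker-execution-role --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
$ aws iam attach-role-policy --role-name sagemaker-execution-role --policy-arn arn:aws:iam::aws:policy/AWSStepFunctionsFullAccess
$ aws iam attach-role-policy --role-name sagemaker-execution-role --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
$ aws iam attach-role-policy --role-name sagemaker-execution-role --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess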
$ docker build -t your-ecr-repo-name -f docker/Dockerfile .
# If you want to use your own image instead of the built-in image
$ docker build -t your-train-ecr-repo-name -f docker/train.Dockerfile .
$ docker build -t your-lambda-ecr-repo-name -f docker/lambda.Dockerfile .
Push the built image to the ECR repository.
This image is used to run SageMaker Processing.
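A typical push sequence looks like the following, assuming the repository lives in your own account and region (the account ID and region are placeholders):

$ aws ecr get-login-password --region your-region | docker login --username AWS --password-stdin 123456789012.dkr.ecr.your-region.amazonaws.com
$ docker tag your-ecr-repo-name:latest 123456789012.dkr.ecr.your-region.amazonaws.com/your-ecr-repo-name:latest
$ docker push 123456789012.dkr.ecr.your-region.amazonaws.com/your-ecr-repo-name:latest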
Download the MNIST data from here.
There are four gz files:
- t10k-images-idx3-ubyte.gz
- t10k-labels-idx1-ubyte.gz
- train-images-idx3-ubyte.gz
- train-labels-idx1-ubyte.gz
Combine these into a single zip file named input.zip:
$ zip input.zip t10k-images-idx3-ubyte.gz t10k-labels-idx1-ubyte.gz train-images-idx3-ubyte.gz train-labels-idx1-ubyte.gz
Upload the zip file to S3.
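For example (the destination key is only a placeholder; use whatever key you then reference in config.yml):

$ aws s3 cp input.zip s3://your-bucket-name/input.zip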
$ tar zcvf source.tar.gz train.py
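This packages the training script. If your setup expects the source archive in S3 rather than locally, it can be uploaded the same way; the destination key below is only an illustration, so check workflow.py for the location it actually expects:

$ aws s3 cp source.tar.gz s3://your-bucket-name/source.tar.gz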
Create a Lambda function using the Lambda container image that you pushed to ECR earlier.
Attach a policy that allows access to SageMaker and S3 to the Lambda function's execution role.
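A CLI sketch of creating the function from the container image (the function name, image URI, and role ARN are placeholders):

$ aws lambda create-function \
    --function-name your-lambda-function-name \
    --package-type Image \
    --code ImageUri=123456789012.dkr.ecr.your-region.amazonaws.com/your-lambda-ecr-repo-name:latest \
    --role arn:aws:iam::123456789012:role/your-lambda-execution-role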
Edit config.yml.
aws:
  role:
    sagemaker_execution: Set the ARN for the SageMaker execution role you created in the previous step.
    workflow_execution: Set the ARN for the Workflow execution role you created in the previous step.
  bucket: Your S3 Bucket name
  ecr_repository_uri: Set the URI of the Docker image you uploaded in the previous step.
  input_data_s3_uri: Set the URI of `input.zip` that you uploaded to S3 in the previous step.
stepfunctions:
  workflow:
    name:
sagemaker:
  experiment:
    name:
    bucket_name: Bucket where you want to store the CSV of the Experiment
    key: Key when saving the CSV of the Experiment
  processing:
    preprocess:
      job_name_prefix:
      instance_count:
      instance_type: execution instance type such as ml.m5.xlarge
      max_runtime_in_seconds:
    evaluation:
      job_name_prefix:
      instance_count:
      instance_type:
      max_runtime_in_seconds:
  training:
    job_name_prefix:
    instance_count:
    instance_type:
    use_spot_instances:
    max_run:
    max_wait:
    hyperparameters:
      learning_rate: '0.001'
      epochs: '5'
    image_uri: If you want to use your own image instead of the built-in image
lambda:
  function_name: The name of the lambda function you just created
See the official documentation for details on options such as instance type.
# Don't forget to set the environment in docker-compose.yml.
$ docker-compose run --rm app bash
$ python workflow.py
Visualize the results of the Experiment with QuickSight. The Experiment CSV is stored in the S3 bucket specified in config.yml, so point a QuickSight dataset at it. The following is a sample manifest.json for the QuickSight dataset.
{
    "fileLocations": [
        {
            "URIs": [
                "s3://**********/*********.csv"
            ]
        },
        {
            "URIPrefixes": [
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "CSV",
        "delimiter": ",",
        "textqualifier": "'",
        "containsHeader": "true"
    }
}
You can specify the path of the config.yml file:
$ python workflow.py -c your-config-file-path