Amazon ECR Helper for AWS HealthOmics

This is a simple serverless application that automates preparing containers for use with AWS HealthOmics Workflows. It performs two key functions:

container-puller: Retrieves container images from public registries (such as Amazon ECR Public, Quay.io, and DockerHub) and stages them in Amazon ECR Private image repositories in your AWS account
container-builder: Builds ECR Private container images from source bundles staged in S3

Under the hood, this application leverages AWS Step Functions, AWS CodeBuild, and Amazon ECR for much of the heavy lifting.

Requirements

The following software is required in your environment:
- AWS CDK
- AWS CLI v2

Usage

Deploy the AWS CloudFormation stacks used by the application in each region you intend to run AWS HealthOmics Workflows using the following:

NOTE:
Ensure you are in the same directory where this file is located

# install package dependencies
npm install

# in your default region (specify profile if other than 'default')
cdk deploy --all --profile <aws-profile>

From here, you can proceed to either:

Retrieving public containers
Retrieving ECR private containers
Building containers

Retrieving public containers

This is the most common case, and likely the only functionality of this CDK application you will use.

Create a container_pull_manifest.json file with contents like:

{
    "manifest": [
        "ubuntu:20.04",
        "us.gcr.io/broad-gatk/gatk:4.4.0.0",
        "ghcr.io/miniwdl-ext/miniwdl-aws:v0.9.0-1-g243f36f",
        "quay.io/biocontainers/bcftools:1.16--hfe4b78e_1",
        "public.ecr.aws/docker/library/python:3.9.16-bullseye",
        "quay/biocontainers/bwa-mem2:2.2.1--he513fc3_0",
        "ecr-public/aws-genomics/google/deepvariant:1.4.0"
    ]
}

Execute the following to pull this list of container images into your ECR private registry:

aws stepfunctions start-execution \
    --state-machine-arn arn:aws:states:<aws-region>:<aws-account-id>:stateMachine:omx-container-puller \
    --input file://container_pull_manifest.json
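
If you prefer to start the execution from a script, the same call can be sketched in Python. This is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper names (state_machine_arn, pull_manifest, start_pull) are hypothetical and not part of this application, and the de-duplication step is a convenience added here, not something the state machine requires.

```python
from __future__ import annotations

import json


def state_machine_arn(region: str, account_id: str,
                      name: str = "omx-container-puller") -> str:
    """Build the state machine ARN in the same form as the CLI example."""
    return f"arn:aws:states:{region}:{account_id}:stateMachine:{name}"


def pull_manifest(image_uris: list[str]) -> str:
    """Serialize image URIs into the {"manifest": [...]} input document."""
    # Drop repeated URIs while keeping the original order (a convenience
    # of this sketch only).
    unique = list(dict.fromkeys(image_uris))
    return json.dumps({"manifest": unique}, indent=4)


def start_pull(region: str, account_id: str, image_uris: list[str]) -> dict:
    """Start an execution; needs boto3 and valid AWS credentials."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    client = boto3.client("stepfunctions", region_name=region)
    return client.start_execution(
        stateMachineArn=state_machine_arn(region, account_id),
        input=pull_manifest(image_uris),
    )
```

Calling start_pull("us-east-1", "111122223333", ["ubuntu:20.04"]) would be roughly equivalent to the CLI command above with a one-image manifest.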

The state-machine will pull source image uris into your private ECR registry according to the following rules:

Images are privatized into repositories with namespaces that correspond to their original source public registry - e.g. a full repository name is of the format <namespace>/<image-name> and a full private image uri will be of the form <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/<namespace>/<image-name>:<image-tag>
Images from Quay.io or ECR Public (public.ecr.aws) use ECR pull through caching
- An image from Quay.io will be pulled into a repository with the quay/ namespace
- An image from public.ecr.aws will be pulled into a repository with the ecr-public/ namespace
Images from other known public registries are pulled and pushed into a custom created ECR private repository using a CodeBuild project
- An image from Google Container Registry (gcr.io) will be pulled into a repository with the gcr/ namespace
- An image from Google Artifact Registry (pkg.dev) will be pulled into a repository with the gar/ namespace
- An image from Github Container Registry (ghcr.io) will be pulled into a repository with the ghcr/ namespace
- An image from Microsoft Container Registry (mcr.microsoft.com) will be pulled into a repository with the mcr/ namespace
- An image from DockerHub will be pulled into a repository with the dockerhub/ namespace
Images from other (less common) public registries are not supported at this time
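
The namespace rules above can be sketched as a small lookup. This is an illustration in Python inferred from the rules and the example manifest, not the application's actual implementation. The names KNOWN_REGISTRIES, private_repository_path, and private_image_uri are hypothetical, and matching registries by hostname suffix (so that us.gcr.io counts as gcr.io) is an assumption.

```python
# Registry hostname (or hostname suffix) -> private namespace, per the rules above.
KNOWN_REGISTRIES = {
    "quay.io": "quay",
    "public.ecr.aws": "ecr-public",
    "gcr.io": "gcr",
    "pkg.dev": "gar",
    "ghcr.io": "ghcr",
    "mcr.microsoft.com": "mcr",
}

# Manifest entries may already be written in namespace form, as in the
# example manifest ("quay/..." and "ecr-public/...").
PASSTHROUGH_NAMESPACES = {"quay", "ecr-public"}


def private_repository_path(source_uri: str) -> str:
    """Map a source image URI to '<namespace>/<image-name>:<image-tag>'."""
    first, _, rest = source_uri.partition("/")
    # Assumption: a first path component containing '.' or ':' and followed
    # by more path is a registry hostname; otherwise it is part of the name.
    looks_like_host = bool(rest) and ("." in first or ":" in first)
    if looks_like_host:
        for suffix, namespace in KNOWN_REGISTRIES.items():
            if first == suffix or first.endswith("." + suffix):
                return f"{namespace}/{rest}"
        raise ValueError(f"unsupported registry: {first}")
    if first in PASSTHROUGH_NAMESPACES and rest:
        return source_uri
    # No registry hostname: treat the image as coming from DockerHub.
    return f"dockerhub/{source_uri}"


def private_image_uri(account_id: str, region: str, source_uri: str) -> str:
    """Full private image URI for a given source image URI."""
    path = private_repository_path(source_uri)
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{path}"
```

In this sketch, hostnames not listed in KNOWN_REGISTRIES raise a ValueError, mirroring the statement that other registries are not supported.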

When the omx-container-puller state machine completes, and all containers have been pulled into ECR Private successfully, you can proceed to configuring and running your workflow.

Retrieving ECR Private containers

This process is only necessary if you need to retrieve container images from ECR Private registries, either in your AWS account or from other AWS accounts.

This process works alongside retrieving container images from public registries.

Create a config file at the root level of this application called app-config.json with contents like:

{
    "container_puller": {
        "source_aws_accounts": [
            "111122223333",
            "444455556666"
        ]
    }
}

The source_aws_accounts list specifies other AWS account ids that the application is allowed to pull ECR Private images from. By default, each account id will resolve to a corresponding ECR Private registry in the region that the CDK application is deployed to.

For the example above, if the CDK application was deployed to the us-east-1 AWS region, this would be:

111122223333.dkr.ecr.us-east-1.amazonaws.com
444455556666.dkr.ecr.us-east-1.amazonaws.com

To allow pulling containers across AWS regions, modify the config to be like:

{
    "container_puller": {
        "source_aws_accounts": [
            "111122223333",
            "444455556666"
        ],
        "allow_cross_region_pull": true
    }
}

Note: Cross-region container image pulls are typically slower and may incur additional cost.

(Re)deploy the application using:

cdk deploy --all

Once deployed, you can start an AWS Step Functions state machine execution as in Retrieving public containers, only now your container_pull_manifest.json can contain image URIs like:

{
    "manifest": [
        "111122223333.dkr.ecr.us-east-1.amazonaws.com/foo:1.1.1",
        "444455556666.dkr.ecr.us-east-1.amazonaws.com/bar:2.2.2"
    ]
}

The state-machine will pull source image uris into your private ECR registry according to the same rules as described in Retrieving public containers. This configuration adds the following:

Images from other ECR Private registries are replicated in your ECR Private registry as <your-aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/<source-ecr-image-repository-name>:<source-image-tag>
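
The replication rule, together with the source_aws_accounts and allow_cross_region_pull settings, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name replicated_image_uri and the regular expression are hypothetical, and the way the checks are enforced here (raising PermissionError) is an assumption about the config semantics, not the application's actual code.

```python
from __future__ import annotations

import re

# Assumed shape of an ECR Private image URI:
# <account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
ECR_PRIVATE_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)"
    r"\.amazonaws\.com/(?P<image>.+)$"
)


def replicated_image_uri(
    source_uri: str,
    target_account: str,
    target_region: str,
    source_aws_accounts: list[str],
    allow_cross_region_pull: bool = False,
) -> str:
    """Return the URI the source image would have in the target registry."""
    match = ECR_PRIVATE_URI.match(source_uri)
    if match is None:
        raise ValueError(f"not an ECR Private image URI: {source_uri}")
    # Only accounts listed in source_aws_accounts may be pulled from.
    if match["account"] not in source_aws_accounts:
        raise PermissionError(f"account {match['account']} is not listed")
    # Registries in other regions require allow_cross_region_pull.
    if match["region"] != target_region and not allow_cross_region_pull:
        raise PermissionError("cross-region pull is not enabled")
    # The repository name and tag are kept; only the registry host changes.
    return (
        f"{target_account}.dkr.ecr.{target_region}.amazonaws.com/"
        f"{match['image']}"
    )
```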

Building containers

Note that this process is only necessary if you wish to build containers from scratch rather than simply retrieving them from a public repository as described above.

Create a config file at the root level of this application called app-config.json with contents like:

{
    "container_builder": {
        "source_uris": [
            "s3://my-bucket-1",
            "s3://my-bucket-2/path/to/source"
        ]
    }
}

The source_uris list specifies S3 locations where source for container images (e.g. Dockerfiles and accompanying assets) has been staged. You must create these locations beforehand.

(Re)deploy the application using:

cdk deploy --all

Sync container image source to the location(s) you specified above. These can either be bare source:

aws s3 sync ./path/to/container-source/image-foo s3://my-bucket-1/container-source/image-foo

or zip bundles:

(cd ./path/to/container-source/image-bar && zip -r ../image-bar.zip .)
aws s3 cp ./path/to/container-source/image-bar.zip s3://my-bucket-1/container-source/image-bar.zip

Create a container_build_manifest.json with contents like the following:

{
    "manifest": [
        {
            "source_uri": "s3://my-bucket-1/container-source/image-foo/",
            "target_image": "foo:omics"
        },
        {
            "source_uri": "s3://my-bucket-1/container-source/image-bar.zip",
            "target_image": "bar:omics"
        }
    ]
}

Execute the following to build this list of container images and place them in your ECR private registry:

aws stepfunctions start-execution \
    --state-machine-arn arn:aws:states:<aws-region>:<aws-account-id>:stateMachine:omx-container-builder \
    --input file://container_build_manifest.json

The state-machine will retrieve the source bundles and build images using the available Dockerfile in each. Images are built and pushed to ECR private repositories that match the target_image values in the input manifest. Full private image uris will be of the form <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/<target_image>. If you need images to have specific namespaces you can include them in the target_image - e.g. mynamespace/foo:omics.
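
Before submitting a build manifest, it can help to check that each source_uri falls under one of the configured source_uris and to preview the resulting image URIs. The following is a minimal Python sketch; the helper names (is_configured_source, build_manifest, built_image_uri) are hypothetical, and treating source_uris as path prefixes is an assumption, not documented behavior of the application.

```python
from __future__ import annotations

import json


def is_configured_source(source_uri: str, source_uris: list[str]) -> bool:
    """True if source_uri equals or sits under one of the configured locations."""
    for prefix in source_uris:
        base = prefix.rstrip("/")
        # Require a '/' boundary so 's3://my-bucket-1' does not match
        # 's3://my-bucket-10/...'.
        if source_uri == base or source_uri.startswith(base + "/"):
            return True
    return False


def build_manifest(entries: dict[str, str], source_uris: list[str]) -> str:
    """Serialize {target_image: source_uri} pairs into the manifest document."""
    manifest = []
    for target_image, source_uri in entries.items():
        if not is_configured_source(source_uri, source_uris):
            raise ValueError(f"{source_uri} is not under a configured source")
        manifest.append({"source_uri": source_uri, "target_image": target_image})
    return json.dumps({"manifest": manifest}, indent=4)


def built_image_uri(account_id: str, region: str, target_image: str) -> str:
    """Full private image URI for a target_image, including any namespace."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{target_image}"
```

For example, built_image_uri("111122223333", "us-east-1", "mynamespace/foo:omics") gives the URI a workflow task would reference for a namespaced image.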

Configure and run workflow

After processing the manifests above, configure your workflow to use images from your ECR Private Registry:

// how this is defined depends on the workflow language used, but is effectively something like:
// configure list of containers
ecr_registry = "<aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com"
containers = {
    "task1": ecr_registry + "/dockerhub/ubuntu:20.04",
    "task2": ecr_registry + "/gcr/broad-gatk/gatk:4.4.0.0",
    "task3": ecr_registry + "/ghcr/miniwdl-ext/miniwdl-aws:v0.9.0-1-g243f36f",
    "task4": ecr_registry + "/quay/biocontainers/bcftools:1.16--hfe4b78e_1",
    "task5": ecr_registry + "/ecr-public/docker/library/python:3.9.16-bullseye",
    "task6": ecr_registry + "/quay/biocontainers/bwa-mem2:2.2.1--he513fc3_0",
    "task7": ecr_registry + "/ecr-public/aws-genomics/google/deepvariant:1.4.0",
    "task8": ecr_registry + "/foo:omics",
    "task9": ecr_registry + "/bar:omics"
}

// use containers[task_name] in task definitions
task task1 {
    runtime {
        container: containers['task1']
    }

    ...
}

Then create and run your workflow using AWS HealthOmics.

Costs

The Amazon ECR Helper for HealthOmics is a serverless application and does not incur costs when idle.

Any Amazon ECR container image repositories created have a storage cost - see Amazon ECR pricing for more details.

How it works

Retrieving container images

The container-puller stack automates "privatizing" container images - re-staging them from public registries to an ECR Private registry.

To do this, it first relies on ECR Pull-through caching which:

allows docker clients to pull an image URI that looks like it comes from ECR Private
creates a private image repository based on the public image that is being pulled
pulls the public image and caches it in the private repository

Second, an EventBridge rule is used to detect when an image repository is created.

Third, the EventBridge rule triggers a Lambda function that applies the required access policy to the ECR repository that was created. This policy looks like:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "omics workflow access",
            "Effect": "Allow",
            "Principal": {"Service": "omics.amazonaws.com"},
            "Action": [
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage",
                "ecr:BatchCheckLayerAvailability"
            ]
        }
    ]
}
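
To make the third step concrete, here is a minimal Python sketch of generating this policy and applying it to a repository. It assumes boto3 is available wherever it runs; the function names are hypothetical and this is not the Lambda function shipped with the application.

```python
import json


def omics_repository_policy() -> str:
    """Return the repository policy shown above as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "omics workflow access",
                "Effect": "Allow",
                "Principal": {"Service": "omics.amazonaws.com"},
                "Action": [
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage",
                    "ecr:BatchCheckLayerAvailability",
                ],
            }
        ],
    }
    return json.dumps(policy)


def apply_policy(repository_name: str, region: str) -> dict:
    """Attach the policy to one repository; needs boto3 and AWS credentials."""
    import boto3  # third-party; imported lazily so the builder works without it

    ecr = boto3.client("ecr", region_name=region)
    return ecr.set_repository_policy(
        repositoryName=repository_name,
        policyText=omics_repository_policy(),
    )
```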

Currently, AWS HealthOmics checks for the existence of ECR image repositories and specific image URIs before launching ECS tasks. This pre-check means you need to "prime" container images into ECR Private prior to running a workflow that depends on them the first time, even if pull-through caching is enabled.

The priming process is automated by submitting a container image manifest to a Step Functions state machine that calls a CodeBuild project that retrieves container image URIs.

The mechanisms above are also generalized to support other public container registries in the following ways:

The ECR CreateRepository API is called when a corresponding repository does not exist. This is only used when retrieving images from public registries that do not support pull-through caching.
AwsApiCall events that create ECR repositories with the tag Key=createdBy,Value=omx-ecr-helper will also trigger the Lambda function.
The CodeBuild project is parameterized to do either pull-through only or pull and push actions.

To save costs, the state machine will only run the CodeBuild project if a requested image URI does not already have a corresponding private ECR image.
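
The existence check can be sketched in Python as below. This is an illustration under stated assumptions rather than the state machine's actual logic: the function names are hypothetical, images are identified by tag only (digests are not handled), a missing tag defaults to latest, and ecr_client is expected to behave like a boto3 ECR client (for example, boto3.client("ecr")).

```python
from __future__ import annotations


def split_repo_and_tag(image: str) -> tuple[str, str]:
    """Split '<repository>:<tag>'; default to 'latest' when no tag is given."""
    repository, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        # No ':' at all, or the last ':' is not a tag separator.
        return image, "latest"
    return repository, tag


def image_exists(ecr_client, image: str) -> bool:
    """True if the private registry already holds this repository and tag."""
    repository, tag = split_repo_and_tag(image)
    try:
        ecr_client.describe_images(
            repositoryName=repository,
            imageIds=[{"imageTag": tag}],
        )
    except (
        ecr_client.exceptions.RepositoryNotFoundException,
        ecr_client.exceptions.ImageNotFoundException,
    ):
        return False
    return True


def images_to_pull(ecr_client, images: list[str]) -> list[str]:
    """Keep only the images that still need a CodeBuild run."""
    return [image for image in images if not image_exists(ecr_client, image)]
```

Passing the client in as an argument keeps the check easy to exercise with a stub, without touching a real registry.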

Building container images

The container-builder stack automates building container images using AWS Step Functions and AWS CodeBuild. An ECR private repository is created as needed for the image. It also utilizes the same EventBridge rule, trigger, and Lambda above to add required ECR repository policies.

Note: There are no capabilities at this time to check if the image source has changed such that it would result in an updated image relative to an existing one in ECR Private. Therefore, CodeBuild project builds in this stack will always execute when processing a manifest. As a result, if images already exist in ECR Private they will be overwritten.

Development

The cdk.json file tells the CDK Toolkit how to execute your app.

Useful commands

- npm run build: compile TypeScript to JS
- npm run watch: watch for changes and compile
- npm run test: perform the jest unit tests
- cdk deploy: deploy this stack to your default AWS account/region
- cdk diff: compare deployed stack with current state
- cdk synth: emits the synthesized CloudFormation template

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
