Sample project for ECS GPU Inference API
Table of Contents
- Deploy VPC stack
- Deploy ECS GPU cluster stack
- Deploy IAM Role stack
- Deploy ECS Service stack
- Scaling Test
- GPU usage test
- Execute a command using ECS Exec
Prerequisites
npm install -g aws-cdk@2.33.0
# install packages in the root folder
npm install
cdk bootstrap
Use the cdk command-line toolkit to interact with your project:
- cdk deploy: deploys your app into an AWS account
- cdk synth: synthesizes an AWS CloudFormation template for your app
- cdk diff: compares your app with the deployed stack
- cdk watch: redeploys every time a file change is detected
CDK Stack
# | Stack | Time |
---|---|---|
1 | VPC | 3m |
2 | ECS EC2 cluster | 5m |
3 | IAM roles | 1m |
4 | ECS Service and ALB | 10m |
 | Total | 19m |
Docker image size: 3.1GB
Steps
Use the deploy-all.sh script to deploy all stacks at once without confirmation prompts.
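The contents of deploy-all.sh are not shown in this README. As a rough sketch of what such a script might do, the loop below deploys the four stack directories listed under Structure in dependency order, using the CDK `--require-approval never` option to skip prompts. The `deploy_all` function and the `DRY_RUN` variable are hypothetical names introduced for illustration; the real script may differ.

```shell
#!/bin/sh
# Hypothetical sketch; the actual deploy-all.sh may differ.
# Stack directories in dependency order (taken from the Structure section).
STACK_DIRS="vpc ecs-ec2-cluster ecs-iam-role ecs-restapi-service"

deploy_all() {
  for dir in $STACK_DIRS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Only print what would be run.
      echo "(cd $dir && cdk deploy --require-approval never)"
    else
      # Run in a subshell so the working directory is restored each time.
      # Stop at the first failure because later stacks depend on earlier ones.
      (cd "$dir" && cdk deploy --require-approval never) || return 1
    fi
  done
}
```

Run `DRY_RUN=1 deploy_all` from the project root to preview the commands, or `deploy_all` to deploy.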
Step 1: VPC
The VPC ID is saved to the SSM Parameter Store so that other stacks can reference it.
Parameter Name : /cdk-ecs-gpu-ec2/vpc-id
Use the -c vpcId context parameter to use an existing VPC.
cd vpc
cdk deploy
Step 2: ECS GPU cluster
cd ../ecs-ec2-cluster
cdk deploy
# or define your VPC id with context parameter
cdk deploy -c vpcId=<vpc-id>
SSM parameter:
- /cdk-ecs-gpu-ec2/vpc-id
Cluster name: defined in ecs-ec2-cluster/lib/cluster-config.ts
ecs-ec2-cluster/lib/ec2ecs-cluster-stack.ts
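Before deploying the cluster, it can help to confirm which VPC the stack will pick up. The sketch below reads the parameter named above with `aws ssm get-parameter` and wraps the two deploy forms in one helper. The function names `stored_vpc_id` and `deploy_cluster` are hypothetical and not part of the project.

```shell
# Print the VPC ID stored by the VPC stack (parameter name from this README).
stored_vpc_id() {
  aws ssm get-parameter --name /cdk-ecs-gpu-ec2/vpc-id \
    --query 'Parameter.Value' --output text
}

# Deploy with an explicit VPC ID when one is given as the first argument;
# otherwise run a plain deploy that relies on the SSM parameter.
deploy_cluster() {
  if [ -n "${1:-}" ]; then
    cdk deploy -c "vpcId=$1"
  else
    cdk deploy
  fi
}
```

For example, `deploy_cluster "$(stored_vpc_id)"` passes the stored value explicitly, while `deploy_cluster` alone behaves like the plain `cdk deploy` above.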
Step 3: IAM Role
Create the ECS Task Execution role and the default Task Role.
- AmazonECSGPUTaskExecutionRole
- ECSGPUDefaultTaskRole including a policy for ECS Exec
cd ../ecs-iam-role
cdk deploy
ecs-iam-role/lib/ecs-gpu-iam-role-stack.ts
Step 4: ECS Service
cd ../ecs-restapi-service
cdk deploy
SSM parameters:
- /cdk-ecs-gpu-ec2/vpc-id
- /cdk-ecs-gpu-ec2/cluster-capacityprovider-name
- /cdk-ecs-gpu-ec2/cluster-securitygroup-id
- /cdk-ecs-gpu-ec2/task-execution-role-arn
- /cdk-ecs-gpu-ec2/default-task-role-arn
ecs-restapi-service/lib/ecs-restapi-service-stack.ts
IMPORTANT
If the ECS cluster was re-created, you MUST delete the cdk.context.json files with the command below before redeploying the ecs-restapi-service stack, because cdk.context.json still caches the old SSM parameter values.
find . -name "cdk.context.json" -exec rm -f {} \;
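A slightly more cautious variant of the cleanup is sketched below: it prints each file it removes and skips node_modules folders. Both behaviors are additions made for illustration, and `clean_cdk_context` is a hypothetical helper name.

```shell
# Print, then delete, every cdk.context.json under a root directory
# (default: the current directory). node_modules is pruned so that
# dependency folders are neither scanned nor modified.
clean_cdk_context() {
  root=${1:-.}
  find "$root" -name node_modules -prune -o \
    -name "cdk.context.json" -type f -print -exec rm -f {} \;
}
```

Calling `clean_cdk_context` from the project root lists the removed files, which makes it easy to confirm that the cached context for ecs-restapi-service is gone before redeploying.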
Step 5: Scaling Test
It takes around x minutes until the new tasks are attached to the ALB.
aws ecs update-service --cluster gpu-ec2-local --service gpu-restapi --desired-count 3
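To measure how long scaling takes in your own environment, a polling loop such as the following could be used. It is only a sketch: it watches `runningCount` from `aws ecs describe-services`, which indicates that tasks are running but not that the ALB health checks have passed. The `wait_for_tasks` name and the default timeout and interval values are assumptions.

```shell
# Poll the service until runningCount equals the wanted count, then print
# the elapsed seconds. Returns 1 if the timeout is reached first.
# Arguments: cluster, service, wanted count, timeout (s), poll interval (s).
wait_for_tasks() {
  cluster=$1; service=$2; want=$3
  timeout=${4:-900}; interval=${5:-15}
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    running=$(aws ecs describe-services --cluster "$cluster" \
      --services "$service" \
      --query 'services[0].runningCount' --output text)
    if [ "$running" = "$want" ]; then
      echo "$elapsed"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}
```

For example, `wait_for_tasks gpu-ec2-local gpu-restapi 3` run right after the update-service command prints the approximate number of seconds until three tasks are running.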
Step 6: GPU usage test
cd test
TEST_URL=$(aws cloudformation describe-stacks --stack-name ecs-gpu-service-restapi-local --query "Stacks[0].Outputs[?OutputKey=='TestURL'].OutputValue" --output text)
echo $TEST_URL
sed -e "s|<url>|${TEST_URL}|g" gpu-api-bzt-template.yaml > gpu-api-bzt.yaml
cat gpu-api-bzt.yaml
bzt gpu-api-bzt.yaml
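If the stack output is missing, the sed command above silently writes an empty URL into the test plan. A small guard, sketched here with the hypothetical helper `render_bzt_config`, refuses to render the file in that case. It assumes the URL contains no `|` or `&` characters, since both are special in this sed expression.

```shell
# Render the bzt test plan from the template, replacing <url> with the
# given URL. Fails without writing the output file when the URL is empty
# or the literal string "None" (which the AWS CLI can print for null values).
# Arguments: url, template file, output file.
render_bzt_config() {
  url=$1; template=$2; output=$3
  if [ -z "$url" ] || [ "$url" = "None" ]; then
    echo "TestURL output not found; is the service stack deployed?" >&2
    return 1
  fi
  sed -e "s|<url>|${url}|g" "$template" > "$output"
}
```

Usage: `render_bzt_config "$TEST_URL" gpu-api-bzt-template.yaml gpu-api-bzt.yaml && bzt gpu-api-bzt.yaml`.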
Step 7: Execute the gpustat command using ECS Exec
Install the Session Manager plugin for the AWS CLI, then list the running tasks:
aws ecs list-tasks --cluster gpu-ec2-local --service-name gpu-restapi
{
"taskArns": [
"arn:aws:ecs:us-east-1:123456789:task/gpu-ec2-local/0a244ff8b8654b3abaaed0880b2b78f1",
"arn:aws:ecs:us-east-1:123456789:task/gpu-ec2-local/ac3d5a4e7273460a80aa18264e4a8f5e"
]
}
TASK_ID=$(aws ecs list-tasks --cluster gpu-ec2-local --service-name gpu-restapi | jq -r '.taskArns[0]' | cut -d '/' -f3)
aws ecs execute-command --cluster gpu-ec2-local --task "$TASK_ID" --container gpu-restapi-container --interactive --command "/bin/sh"
Connect to an ECS Task and run the gpustat command:
gpustat
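When several tasks are running after the scaling test, it may be convenient to check GPU usage on all of them without opening a shell in each. The sketch below extracts every task ID using the CLI's own `--query` option instead of jq, then runs gpustat once per task through ECS Exec. The helper names `list_task_ids` and `gpustat_all` are hypothetical.

```shell
# Print the ID (last path segment of the ARN) of every task in a service,
# one per line. Arguments: cluster, service.
list_task_ids() {
  aws ecs list-tasks --cluster "$1" --service-name "$2" \
    --query 'taskArns[]' --output text |
    tr '\t' '\n' | sed -e 's|.*/||' -e '/^$/d'
}

# Run gpustat once in each task of the service via ECS Exec.
# Arguments: cluster, service, container name.
gpustat_all() {
  for id in $(list_task_ids "$1" "$2"); do
    echo "== task $id"
    aws ecs execute-command --cluster "$1" --task "$id" \
      --container "$3" --interactive --command "gpustat"
  done
}
```

With the names used in this README, the call would be `gpustat_all gpu-ec2-local gpu-restapi gpu-restapi-container`.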
Step 8: Cleanup
Structure
├── build.gradle
├── package.json
├── ssm-prefix.ts
├── tsconfig.json
├── vpc
│ ├── bin
│ │ └── index.ts
│ ├── cdk.json
│ └── lib
│ └── vpc-stack.ts
├── ecs-ec2-cluster
│ ├── bin
│ │ └── index.ts
│ ├── cdk.json
│ ├── lib
│ │ ├── cluster-config.ts
│ │ └── ec2ecs-cluster-stack.ts
│ └── settings.yaml
├── ecs-iam-role
│ ├── bin
│ │ └── index.ts
│ ├── cdk.json
│ └── lib
│ └── ecs-gpu-iam-role-stack.ts
├── ecs-restapi-service
│ ├── bin
│ │ └── index.ts
│ ├── cdk.json
│ ├── lib
│ │ └── ecs-restapi-service-stack.ts
├── app
│ ├── Dockerfile
│ ├── README.md
│ ├── build.sh
│ ├── flask_api.py
│ ├── gunicorn.config.py
│ └── requirements.txt
Reference
Docs
- Dynamic Port Mapping - The host and awsvpc network modes do not support dynamic host port mapping.
- ECS Exec for debugging