/ocr-sqs-s3-ecs-cloudformation

Primary language: Shell. License: Apache License 2.0 (Apache-2.0).

CloudFormation Templates for an example OCR service with Amazon ECS, SQS, and S3

Overview

This repository contains CloudFormation templates to deploy an example OCR service on AWS using ECS, SQS, and S3.

The templates are based on the AWS reference architecture; see https://github.com/aws-samples/ecs-refarch-cloudformation

Process

[Diagram: process-overview]

Images of scanned or photographed text can be uploaded in the web application. The files are transferred to an S3 bucket, which has a bucket notification configured that sends a message to an SQS queue. OCR worker containers poll the queue, recognize the text, and store the results in another S3 bucket.
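As an illustration of the step where a worker reads a queue message, the following minimal sketch extracts the bucket name and object key from an S3 event notification delivered through SQS. It is written in Python for illustration only and is not the code of the actual worker (nodejs-sqs-ocr); the function name `extract_uploaded_objects`, the bucket name, and the object key are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def extract_uploaded_objects(message_body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for each object-created record in a message body."""
    event = json.loads(message_body)
    uploads = []
    # S3 sends an "s3:TestEvent" message without "Records" when a notification
    # is first configured, so a missing list is treated as "nothing to do".
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in the notification (spaces arrive as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        uploads.append((bucket, key))
    return uploads


# Hypothetical message body, shaped like an S3 event notification.
sample_body = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-input-bucket"},
            "object": {"key": "scans/page+1.png", "size": 1024},
        },
    }]
})

print(extract_uploaded_objects(sample_body))
# [('example-input-bucket', 'scans/page 1.png')]
```

A real worker would receive `message_body` from the queue, download the object, run OCR on it, write the result to the output bucket, and only then delete the message from the queue.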

Infrastructure

[Diagram: infrastructure-overview]

The repository consists of a set of nested templates, which are described under "Template details" below.

Both the web app and the OCR workers are scaled in and out based on CPU utilization. This could be improved by scaling the OCR workers based on the SQS queue depth using a CloudWatch alarm.
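The queue-depth idea can be sketched as a small sizing rule: divide the number of waiting messages by the backlog one worker is expected to handle, then clamp the result to the allowed range. This is only an assumption about how such scaling might be tuned and is not part of the templates. The function name and the default values are hypothetical; the input would typically come from the SQS metric `ApproximateNumberOfMessagesVisible` in CloudWatch.

```python
import math


def desired_worker_count(visible_messages: int,
                         messages_per_worker: int = 10,
                         min_workers: int = 1,
                         max_workers: int = 10) -> int:
    """Return how many OCR workers to run for the current queue depth."""
    if messages_per_worker <= 0:
        raise ValueError("messages_per_worker must be positive")
    # Round up so that a partially filled "slot" still gets a worker.
    needed = math.ceil(visible_messages / messages_per_worker)
    # Keep the result inside the allowed range of the service.
    return max(min_workers, min(max_workers, needed))


for depth in (0, 25, 1000):
    print(depth, desired_worker_count(depth))
# 0 1
# 25 3
# 1000 10
```

Scaling on backlog rather than CPU lets the worker count follow the amount of pending work, even when individual workers are idle while waiting on I/O.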

Template details

The templates below are included in this repository and reference architecture:

| Template | Description |
| --- | --- |
| master.yaml | This is the master template. Deploy it to CloudFormation and it includes all of the others automatically. |
| infrastructure/vpc.yaml | This template deploys a VPC with a pair of public and private subnets spread across two Availability Zones. It deploys an Internet gateway, with a default route on the public subnets. It deploys a pair of NAT gateways (one in each zone), and default routes for them in the private subnets. |
| infrastructure/security-groups.yaml | This template contains the security groups required by the entire stack. They are created in a separate nested template, so that they can be referenced by all of the other nested templates. |
| infrastructure/sqs.yaml | This template contains the SQS queue used to store the events for newly uploaded images that should be processed. It also contains the VPC endpoint for SQS. |
| infrastructure/s3.yaml | This template contains the S3 buckets used to store input and output files. It also contains the VPC endpoint for S3. |
| infrastructure/load-balancers.yaml | This template deploys an ALB to the public subnets, which exposes the various ECS services. It is created in a separate nested template, so that it can be referenced by all of the other nested templates and so that the various ECS services can register with it. |
| infrastructure/ecs-cluster.yaml | This template deploys an ECS cluster to the private subnets using an Auto Scaling group and installs the AWS SSM agent with related policy requirements. |
| infrastructure/lifecyclehook.yaml | This template deploys a Lambda function and an Auto Scaling lifecycle hook to drain tasks from your container instances when an instance in your Auto Scaling group is selected for termination. |
| services/webapp-service/service.yaml | This is the example web application, see aws-ocr-s3-frontend. |
| services/ocr-worker-service/service.yaml | This is the example OCR worker service, see nodejs-sqs-ocr. |

Network

This set of templates deploys the following network design:

| Item | CIDR Range | Usable IPs | Description |
| --- | --- | --- | --- |
| VPC | 10.180.0.0/16 | 65,536 | The whole range used for the VPC and all subnets |
| Public Subnet | 10.180.8.0/21 | 2,043 | The public subnet in the first Availability Zone |
| Public Subnet | 10.180.16.0/21 | 2,043 | The public subnet in the second Availability Zone |
| Private Subnet | 10.180.24.0/21 | 2,043 | The private subnet in the first Availability Zone |
| Private Subnet | 10.180.32.0/21 | 2,043 | The private subnet in the second Availability Zone |
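The address arithmetic behind this layout can be checked with Python's standard `ipaddress` module. The sketch below assumes the CIDR ranges listed above and relies on the fact that AWS reserves five addresses in every subnet (the first four and the last), so a /21 provides 2,048 - 5 = 2,043 assignable addresses. The subnet labels and the function name `summarize_subnets` are hypothetical.

```python
import ipaddress
from itertools import combinations

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one

VPC_CIDR = "10.180.0.0/16"
SUBNET_CIDRS = {
    "public-az1": "10.180.8.0/21",
    "public-az2": "10.180.16.0/21",
    "private-az1": "10.180.24.0/21",
    "private-az2": "10.180.32.0/21",
}


def summarize_subnets(vpc_cidr: str, subnet_cidrs: dict[str, str]) -> dict[str, int]:
    """Validate the layout and return the assignable address count per subnet."""
    vpc = ipaddress.ip_network(vpc_cidr)
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    # Every subnet has to sit inside the VPC range.
    for name, net in networks.items():
        if not net.subnet_of(vpc):
            raise ValueError(f"{name} ({net}) is outside {vpc}")
    # No two subnets may share addresses.
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            raise ValueError(f"{name_a} overlaps {name_b}")
    return {name: net.num_addresses - AWS_RESERVED_PER_SUBNET
            for name, net in networks.items()}


for name, usable in summarize_subnets(VPC_CIDR, SUBNET_CIDRS).items():
    print(f"{name}: {usable} assignable addresses")
```

The same check is useful when changing the ranges in the VPC template, since an overlapping or out-of-range subnet would otherwise only surface as a failed stack deployment.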