Whisper Processing AWS Stack

This is a full-stack processing pipeline for transcribing voice data using the Whisper model.

It is built on top of AWS services, including S3, OpenSearch, Lambda, Step Functions, and Batch.

Development

We use AWS Chalice to develop the Lambda functions.

The Batch jobs are plain Docker containers.

We use AWS CDK to manage and deploy the entire stack.

The Step Functions definitions are in infrastructure/stacks/processing_stack.py. When editing them, follow the AWS Step Functions documentation.
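As a rough illustration of what a Step Functions definition looks like, the sketch below builds a two-state machine in Amazon States Language (ASL) as a plain Python dict: a Lambda task followed by a Batch job. It is not taken from processing_stack.py, which may well use CDK constructs instead of raw ASL. The state names, the job name, the ARNs, and the helpers `build_definition` and `check_transitions` are all hypothetical.

```python
import json


def build_definition(prepare_fn_arn, job_queue_arn, job_definition_arn):
    """Return a hypothetical ASL definition: one Lambda task, then one Batch job."""
    return {
        "Comment": "Illustrative only; not the definition used by this stack.",
        "StartAt": "PrepareInput",
        "States": {
            "PrepareInput": {
                "Type": "Task",
                # Service integration that invokes a Lambda function.
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": prepare_fn_arn,
                    "Payload.$": "$",  # pass the whole state input through
                },
                "OutputPath": "$.Payload",
                "Next": "Transcribe",
            },
            "Transcribe": {
                "Type": "Task",
                # ".sync" makes the state wait until the Batch job finishes.
                "Resource": "arn:aws:states:::batch:submitJob.sync",
                "Parameters": {
                    "JobName": "whisper-transcribe",  # hypothetical job name
                    "JobQueue": job_queue_arn,
                    "JobDefinition": job_definition_arn,
                },
                "End": True,
            },
        },
    }


def check_transitions(definition):
    """Raise ValueError if StartAt or any Next names an undefined state.

    Only top-level StartAt/Next fields are checked; Choice rules and
    Catch clauses are not inspected.
    """
    states = definition["States"]
    targets = [definition["StartAt"]]
    targets += [s["Next"] for s in states.values() if "Next" in s]
    missing = sorted(set(targets) - set(states))
    if missing:
        raise ValueError(f"undefined states: {missing}")
    return definition


if __name__ == "__main__":
    # Placeholder ARNs; substitute the ones created by your own stack.
    definition = build_definition(
        "arn:aws:lambda:us-east-1:123456789012:function:prepare-input",
        "arn:aws:batch:us-east-1:123456789012:job-queue/whisper-queue",
        "arn:aws:batch:us-east-1:123456789012:job-definition/whisper-job:1",
    )
    print(json.dumps(check_transitions(definition), indent=2))
```

Running the file prints the definition as JSON, which can be compared against the Step Functions documentation. The transition check is a cheap sanity test only and does not replace validation by the service itself.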

To deploy, run the following commands in order:

make deploy stack="whisper-processing-aws-ecr"

make build-lambdas

make deploy stack="whisper-processing-aws-stack"