/aws-ecs-airflow

Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks

Primary language: HCL. License: MIT.

airflow-ecs

Setup to run Airflow in AWS ECS containers

Requirements

Local

  • Docker

AWS

  • AWS IAM user for the infrastructure deployment, with admin permissions
  • awscli, installed by running pip install awscli
  • terraform >= 0.13
  • Set up your IAM user credentials inside ~/.aws/credentials
  • Set these environment variables in your .zshrc or .bashrc, or in the terminal session that you are going to use:
    export AWS_ACCOUNT=your_account_id
    export AWS_DEFAULT_REGION=us-east-1 # the default region; it must match the region set in infrastructure/config.tf
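
Before running any deployment command, it can help to confirm that both variables are actually set in the current session. The following is a minimal sketch; check_aws_env is a hypothetical helper name and is not part of this repository.

```shell
# Hypothetical helper (not part of the repository): fail early when a
# variable the deployment relies on is missing or empty.
check_aws_env() {
  missing=0
  for name in AWS_ACCOUNT AWS_DEFAULT_REGION; do
    # Indirect expansion through eval keeps this compatible with POSIX sh
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_aws_env returns a non-zero status and names each missing variable on stderr. To also confirm that the credentials in ~/.aws/credentials are usable, aws sts get-caller-identity prints the account and user they resolve to.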
    

Local Development

  • Generate a Fernet Key:

    pip install cryptography
    export AIRFLOW_FERNET_KEY=$(python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
    

    More about that here

  • Start Airflow locally by running:

    docker-compose up --build
    

If everything runs correctly, you can reach Airflow at localhost:8080. The current setup is based on Celery workers. You can monitor how many workers are currently active using Flower at localhost:5555.
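
The containers can take a little while to become reachable after docker-compose starts them. A small polling loop is one way to wait for both ports; this is only a sketch, wait_for_url is a hypothetical helper, and the /health path in the usage note assumes the Airflow version in the image exposes that endpoint.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: url, number of attempts (default 30), delay in seconds (default 2).
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures, -s: quiet, -o: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

For example, wait_for_url http://localhost:8080/health followed by wait_for_url http://localhost:5555 would block until the webserver and Flower respond, or give up after about a minute each.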

Deploy Airflow on AWS ECS

To run Airflow in AWS we will use ECS (Elastic Container Service).

Deploy Infrastructure using Terraform

Run the following commands:

make infra-init
make infra-plan
make infra-apply

or alternatively

cd infrastructure
terraform get
terraform init -upgrade
terraform plan
terraform apply

By default the infrastructure is deployed in us-east-1.

Once the infrastructure is provisioned (the RDS metadata DB will take a while), check that the ECR repository has been created, then run:

bash scripts/push_to_ecr.sh airflow-dev

By default, the repository name created with Terraform is airflow-dev. Without this command, the ECS services will fail to fetch the latest image from ECR.
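
For readers who want to see what pushing to ECR involves, the steps can be sketched roughly as below. This is an assumption about the general workflow, not the contents of scripts/push_to_ecr.sh; the function names, the latest tag, and building from the current directory are all hypothetical.

```shell
# Build the fully qualified ECR image URI from account, region, repo and tag.
ecr_image_uri() {
  account=$1
  region=$2
  repo=$3
  tag=${4:-latest}   # assumption: the services pull the "latest" tag
  echo "${account}.dkr.ecr.${region}.amazonaws.com/${repo}:${tag}"
}

# Hypothetical equivalent of a push script; the real one may differ.
# Relies on AWS_ACCOUNT and AWS_DEFAULT_REGION being exported.
push_image() {
  repo=$1
  registry="${AWS_ACCOUNT}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com"
  image=$(ecr_image_uri "$AWS_ACCOUNT" "$AWS_DEFAULT_REGION" "$repo")
  # Authenticate Docker against the private registry
  aws ecr get-login-password --region "$AWS_DEFAULT_REGION" |
    docker login --username AWS --password-stdin "$registry" || return 1
  # Build from the current directory (assumed to contain the Dockerfile)
  docker build -t "$image" . || return 1
  docker push "$image"
}
```

To check that the repository exists before pushing, aws ecr describe-repositories --repository-names airflow-dev returns its details, or an error if it has not been created yet.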

Deploy new Airflow application

To deploy an updated version of Airflow, you need to push a new container image to ECR. You can do that by running:

./scripts/deploy.sh airflow-dev

The deployment script takes care of:

  • pushing a new image to your ECR repository
  • re-deploying the ECS services with the updated image
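
The re-deploy step can be illustrated with the AWS CLI. The sketch below is not the contents of scripts/deploy.sh; redeploy_services is a hypothetical helper, and the cluster and service names in the usage note are placeholders that must be replaced with the ones Terraform created.

```shell
# Hypothetical helper: force each listed ECS service to start a new
# deployment so that its tasks pull the image again.
# Arguments: cluster name, then one or more service names.
redeploy_services() {
  cluster=$1
  shift
  for service in "$@"; do
    aws ecs update-service \
      --cluster "$cluster" \
      --service "$service" \
      --force-new-deployment >/dev/null || return 1
    echo "redeployed: $service"
  done
}
```

A call such as redeploy_services my-cluster my-webserver my-scheduler would restart those services one after another. Forcing a new deployment only picks up a new image when the task definition references a mutable tag such as latest; with fixed tags, a new task definition revision would be needed instead.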

TODO

  • Create Private Subnets
  • Move ECS containers to Private Subnets
  • Use ECS private Links for Private Subnets
  • Improve ECS Task and Service Role