terraform-aws-mwaa

Terraform module to set up Amazon Managed Workflows for Apache Airflow (MWAA), the managed Airflow service from AWS.

Primary language: HCL. License: MIT.

AWS MWAA Terraform Module

Terraform module which creates AWS MWAA resources and connects them together.

How to

Use this code to create a basic MWAA environment (using all default parameters, see Inputs):

module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  account_id           = "123456789012" # placeholder, AWS account IDs have 12 digits
  environment_name     = "MyEnvironment"
  internet_gateway_id  = "igw-12345"
  private_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"] # depending on your VPC IP range
  public_subnet_cidrs  = ["10.0.3.0/24", "10.0.4.0/24"] # depending on your VPC IP range
  region               = "us-west-1"
  source_bucket_arn    = "arn:aws:s3:::MyMwaaBucket"
  vpc_id               = "vpc-12345"
}

Add permissions to the Airflow execution role

To give additional permissions to your Airflow execution role (e.g. elasticmapreduce:RunJobFlow to start an EMR cluster), create a policy document containing the permissions you need:

data "aws_iam_policy_document" "additional_execution_policy_doc" {
  statement {
    effect = "Allow"
    actions = [
      "<Your permissions>"
    ]
    resources = [
      "<YourResource>"
    ]
  }
}

Then pass the document JSON to the module:

module "airflow" {
  ...
  additional_execution_role_policy_document_json = data.aws_iam_policy_document.additional_execution_policy_doc.json
  ...
}
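
As a concrete illustration of the two steps above, the following sketch grants the execution role permission to start EMR clusters. The action name `elasticmapreduce:RunJobFlow` is the IAM action behind starting an EMR cluster; the data source name `emr_execution_policy_doc` and the wildcard resource are assumptions chosen for the example, so narrow them to fit your setup.

```hcl
# Hypothetical policy document: lets Airflow tasks start EMR clusters.
data "aws_iam_policy_document" "emr_execution_policy_doc" {
  statement {
    effect    = "Allow"
    actions   = ["elasticmapreduce:RunJobFlow"]
    resources = ["*"] # assumption: restrict this where your use case allows
  }
}

module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  # ... remaining arguments as in the basic example ...

  # The rendered JSON is attached to the base MWAA execution role.
  additional_execution_role_policy_document_json = data.aws_iam_policy_document.emr_execution_policy_doc.json
}
```

Starting a cluster usually also requires `iam:PassRole` on the EMR service and instance roles; add a second statement for that if your DAGs need it.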

Add custom plugins

Upload the plugins.zip to S3 and pass its relative path inside the MWAA bucket to the plugins_s3_path parameter. If you zip and upload it via Terraform, it looks like this:

module "airflow" {
  ...
  plugins_s3_path = aws_s3_bucket_object.your_plugin.key
  ...
}
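
The `aws_s3_bucket_object.your_plugin` resource referenced above is not created by the module. A minimal sketch of how it could be defined is shown here, assuming a local `plugins/` directory next to your Terraform code, the `hashicorp/archive` provider for zipping, and the `aws_s3_bucket.mwaa` bucket from the S3 Bucket configuration section; the directory and object key names are illustrative.

```hcl
# Zip the local plugins directory (the path is an assumption).
data "archive_file" "plugins" {
  type        = "zip"
  source_dir  = "${path.module}/plugins"
  output_path = "${path.module}/plugins.zip"
}

# Upload the archive to the MWAA source bucket.
resource "aws_s3_bucket_object" "your_plugin" {
  bucket = aws_s3_bucket.mwaa.id
  key    = "plugins.zip"
  source = data.archive_file.plugins.output_path
  etag   = data.archive_file.plugins.output_md5 # re-upload when the content changes
}

module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  # ... remaining arguments as in the basic example ...

  source_bucket_arn         = aws_s3_bucket.mwaa.arn
  plugins_s3_path           = aws_s3_bucket_object.your_plugin.key
  plugins_s3_object_version = aws_s3_bucket_object.your_plugin.version_id
}
```

Passing the object version is optional here; it is included on the assumption that you want the environment to pick up a specific upload of the archive.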

Use your own networking config

If you set create_networking_config = false, no subnets, EIPs, NAT gateways, or route tables are created. You still need these networking resources to get your environment running, so follow the official documentation to create them properly and pass the IDs of your existing private subnets via private_subnet_ids.
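
A module call that reuses existing networking might look like the following sketch. All IDs are placeholders, and the subnets are assumed to be private subnets in two different availability zones with outbound routing already in place.

```hcl
module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  account_id          = "123456789012" # placeholder
  environment_name    = "MyEnvironment"
  internet_gateway_id = "igw-12345" # still a required input
  region              = "us-west-1"
  source_bucket_arn   = "arn:aws:s3:::MyMwaaBucket"
  vpc_id              = "vpc-12345"

  # Skip creation of subnets, EIPs, NAT gateways and route tables.
  create_networking_config = false

  # Placeholder IDs of existing private subnets.
  private_subnet_ids = ["subnet-aaaa1111", "subnet-bbbb2222"]
}
```

In this mode private_subnet_cidrs and public_subnet_cidrs can stay at their empty defaults, since no subnets are created.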

S3 Bucket configuration

MWAA needs an S3 bucket to store the DAG files. Here is a minimal configuration for this S3 bucket:

resource "aws_s3_bucket" "mwaa" {
  bucket = "<your-bucket-name>"
  versioning {
    # required: https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-s3-bucket.html
    enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "mwaa" {
  # required: https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-s3-bucket.html
  bucket                  = aws_s3_bucket.mwaa.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
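
To connect this bucket to the module, one option is to reference its ARN instead of hard-coding it. The sketch below also uploads a requirements.txt; the local file path and object key are assumptions, while source_bucket_arn, requirements_s3_path and dag_s3_path are inputs of this module.

```hcl
# Hypothetical upload of a local requirements.txt into the MWAA bucket.
resource "aws_s3_bucket_object" "requirements" {
  bucket = aws_s3_bucket.mwaa.id
  key    = "requirements.txt"
  source = "${path.module}/requirements.txt"
  etag   = filemd5("${path.module}/requirements.txt")
}

module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  # ... remaining arguments as in the basic example ...

  source_bucket_arn    = aws_s3_bucket.mwaa.arn
  dag_s3_path          = "dags/" # module default, shown for clarity
  requirements_s3_path = aws_s3_bucket_object.requirements.key
}
```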

Requirements

| Name | Version |
|------|---------|
| terraform | >=1.0.0 |
| aws | ~> 3.0 |

Providers

| Name | Version |
|------|---------|
| aws | 3.75.1 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| aws_eip.this | resource |
| aws_iam_role.this | resource |
| aws_iam_role_policy.this | resource |
| aws_mwaa_environment.this | resource |
| aws_nat_gateway.this | resource |
| aws_route_table.private | resource |
| aws_route_table.public | resource |
| aws_route_table_association.private | resource |
| aws_route_table_association.public | resource |
| aws_security_group.this | resource |
| aws_subnet.private | resource |
| aws_subnet.public | resource |
| aws_iam_policy_document.assume | data source |
| aws_iam_policy_document.base | data source |
| aws_iam_policy_document.this | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| account_id | Account ID of the account in which MWAA will be started | string | n/a | yes |
| additional_execution_role_policy_document_json | Additional permissions to attach to the base MWAA execution role | string | "{}" | no |
| airflow_configuration_options | Additional configuration to overwrite Airflow's standard config | map(string) | {} | no |
| airflow_version | Airflow version to be used | string | "2.0.2" | no |
| create_networking_config | true if networking resources (subnets, EIP, NAT gateway and route table) should be created | bool | true | no |
| dag_processing_logs_enabled | n/a | bool | true | no |
| dag_processing_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
| dag_s3_path | Relative path of the dags folder within the source bucket | string | "dags/" | no |
| environment_class | n/a | string | "mw1.small" | no |
| environment_name | Name of the MWAA environment | string | n/a | yes |
| internet_gateway_id | ID of the internet gateway to the VPC | any | n/a | yes |
| kms_key_arn | KMS CMK ARN to use by MWAA for data encryption. MUST reference the same KMS key as used by the S3 bucket specified by source_bucket_arn, if the bucket uses KMS. If not specified, the default AWS owned key for MWAA will be used for backward compatibility with version 1.0.1 of this module. | string | null | no |
| max_workers | n/a | string | "10" | no |
| min_workers | n/a | string | "1" | no |
| plugins_s3_object_version | n/a | any | null | no |
| plugins_s3_path | Relative path of the plugins.zip within the source bucket | string | null | no |
| private_subnet_cidrs | CIDR blocks for the private subnets MWAA uses. Must be at least 2 if create_networking_config=true | list(string) | [] | no |
| private_subnet_ids | Subnet IDs of the existing private subnets that should be used if create_networking_config=false | list(string) | [] | no |
| public_subnet_cidrs | CIDR blocks for the public subnets MWAA uses. Must be at least 2 if create_networking_config=true | list(string) | [] | no |
| region | AWS Region where the environment and its resources will be created | string | n/a | yes |
| requirements_s3_object_version | n/a | any | null | no |
| requirements_s3_path | Relative path of the requirements.txt (incl. filename) within the source bucket | string | null | no |
| scheduler_logs_enabled | n/a | bool | true | no |
| scheduler_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
| source_bucket_arn | ARN of the bucket in which DAGs, plugins and requirements are put | string | n/a | yes |
| tags | n/a | map(string) | {} | no |
| task_logs_enabled | n/a | bool | true | no |
| task_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "INFO" | no |
| vpc_id | ID of the VPC in which the environment's resources are created | any | n/a | yes |
| webserver_access_mode | Default: PRIVATE_ONLY | string | null | no |
| webserver_logs_enabled | n/a | bool | true | no |
| webserver_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
| weekly_maintenance_window_start | The day and time of the week in Coordinated Universal Time (UTC) 24-hour standard time to start weekly maintenance updates of your environment, in the format DAY:HH:MM. For example: TUE:03:30. You can specify a start time in 30 minute increments only | string | "MON:01:00" | no |
| worker_logs_enabled | n/a | bool | true | no |
| worker_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
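
To show how several of the optional inputs combine, here is a sketch that overrides sizing, logging, and Airflow configuration. The chosen values are illustrative assumptions rather than recommendations; note that min_workers and max_workers are typed as strings by this module.

```hcl
module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  # ... required arguments as in the basic example ...

  # Sizing (example values).
  environment_class = "mw1.medium"
  min_workers       = "2"
  max_workers       = "5"

  # Logging: more verbose scheduler logs, worker logs switched off.
  scheduler_logs_level = "INFO"
  worker_logs_enabled  = false

  # Make the web server reachable from the internet instead of the
  # PRIVATE_ONLY default.
  webserver_access_mode = "PUBLIC_ONLY"

  # Maintenance window in DAY:HH:MM (UTC), in 30 minute increments.
  weekly_maintenance_window_start = "SUN:02:30"

  # Airflow configuration overrides, keyed as "section.option".
  airflow_configuration_options = {
    "core.default_task_retries" = "3"
  }

  tags = {
    Team = "data-platform" # hypothetical tag
  }
}
```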

Outputs

| Name | Description |
|------|-------------|
| mwaa_arn | n/a |
| mwaa_nat_gateway_public_ips | List of the IPs of the NAT gateways created by this module. |
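
The outputs can be consumed from the calling configuration like any other module output. The sketch below assumes the module is instantiated as `module "airflow"`, as in the examples in this README; the output names on the left are hypothetical.

```hcl
# Expose the environment ARN, e.g. for use in IAM policies elsewhere.
output "airflow_environment_arn" {
  value = module.airflow.mwaa_arn
}

# Public egress IPs of the NAT gateways, useful for allow-listing
# the environment in firewalls of external systems.
output "airflow_egress_ips" {
  value = module.airflow.mwaa_nat_gateway_public_ips
}
```

The NAT gateway IP list presumably only contains entries when the module creates the networking resources itself, i.e. with create_networking_config = true.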