/terraform-aws-rds-alarms

This Terraform module manages Cloudwatch Alarms for an RDS instance.

Primary LanguageHCLMIT LicenseMIT

Terraform Module for AWS RDS Cloudwatch Alarms

This Terraform module manages Cloudwatch Alarms for an RDS instance. It does NOT create or manage RDS, only Metric Alarms.

Requires:

  • AWS Provider
  • Terraform 0.12

Alarms Created

Alarms Always Created (default values can be overridden):

  • CPU Utilization above 90%
  • Disk queue depth above 64
  • Disk space less than 10 GB
  • EBS Volume burst balance less than 100
  • Freeable memory below 256 MB
  • Swap usage above 256 MB
  • Anomalous connection count

If the instance type is a T-Series instance type (automatically determind), the following alarms are also created:

  • CPU Credit Balance below 100

If the database engine is any of postgres type (configured with var.engine), then the following alarms are also created:

  • Maximum used transaction IDs over 1,000,000,000 [reference]

Estimated Operating Cost: $ 1.00 / month

  • $ 0.10 / month for Metric Alarms (7x)
  • $ 0.30 / month for Anomaly Alarm (1x)

Example

resource "aws_db_instance" "default" {
  allocated_storage    = 10
  storage_type         = "gp2"
  engine               = "mysql"
  engine_version       = "5.7"
  instance_class       = "db.t2.micro"
  identifier_prefix    = "rds-server-example"
  name                 = "my-db"
  username             = "foo"
  password             = "bar"
  parameter_group_name = "default.mysql5.7"
  apply_immediately    = "true"
  skip_final_snapshot  = "true"
}

resource "aws_sns_topic" "default" {
  name_prefix = "my-premade-topic"
}

module "aws-rds-alarms" {
  source         = "lorenzoaiello/rds-alarms/aws"
  version        = "x.y.z"
  db_instance_id = aws_db_instance.default.id
  actions_alarm  = [aws_sns_topic.default.arn]
  actions_ok     = [aws_sns_topic.default.arn]
}

This above can pair very nicely for example with an module which creates an SNS topic which sends your alerts into Slack. For example:

resource "aws_db_instance" "default" {
  allocated_storage    = 10
  storage_type         = "gp2"
  engine               = "mysql"
  engine_version       = "5.7"
  instance_class       = "db.t2.micro"
  identifier_prefix    = "rds-server-example"
  name                 = "my-db"
  username             = "foo"
  password             = "bar"
  parameter_group_name = "default.mysql5.7"
  apply_immediately    = "true"
  skip_final_snapshot  = "true"
}

module "notify_slack" {
  source  = "terraform-aws-modules/notify-slack/aws"
  version = "~> 4.0"

  sns_topic_name = "slack-topic"

  slack_webhook_url = "https://hooks.slack.com/services/AAA/BBB/CCC"
  slack_channel     = "aws-notification"
  slack_username    = "reporter"
}

module "aws-rds-alarms" {
  source         = "lorenzoaiello/rds-alarms/aws"
  version        = "x.y.z"
  db_instance_id = aws_db_instance.default.id
  actions_alarm  = [module.sns_to_slack.this_slack_topic_arn]
  actions_ok     = [module.sns_to_slack.this_slack_topic_arn]
}

Variables

Name Description Type Default Required
actions_alarm A list of actions to take when alarms are triggered. Will likely be an SNS topic for event distribution. list [] no
actions_ok A list of actions to take when alarms are cleared. Will likely be an SNS topic for event distribution. list [] no
anomaly_period The number of seconds that make each evaluation period for anomaly detection. string "600" no
anomaly_band_width The width of the anomaly band detection. Higher numbers means less sensitive string "2" no
db_instance_id RDS Instance ID string n/a yes
engine The RDS engine being used. Used for database engine specific alarms string "" no
evaluation_period The evaluation period over which to use when triggering alarms. string "5" no
prefix Alarm Name Prefix string "" no
statistic_period The number of seconds that make each statistic period. string "60" no
tags Tags to attach to each alarm map(string) {} no
db_instance_class The rds instance-class, e.g. db.t3.medium string yes
cpu_utilization_too_high_threshold Alarm threshold for the 'highCPUUtilization' alarm string "90" no
cpu_credit_balance_too_low_threshold Alarm threshold for the 'lowCPUCreditBalance' alarm string "100" no
disk_queue_depth_too_high_threshold Alarm threshold for the 'highDiskQueueDepth' alarm string "64" no
disk_free_storage_space_too_low_threshold Alarm threshold for the 'lowFreeStorageSpace' alarm (in bytes) string "10000000000" no
disk_burst_balance_too_low_threshold Alarm threshold for the 'lowEBSBurstBalance' alarm string "100" no
maximum_used_transaction_ids_too_high_threshold Alarm threshold for the 'maximumUsedTransactionIDs' alarm string "1000000000" no
memory_freeable_too_low_threshold Alarm threshold for the 'lowFreeableMemory' alarm (in bytes) string "256000000" no
memory_swap_usage_too_high_threshold Alarm threshold for the 'highSwapUsage' alarm (in bytes) string "256000000" no
create_high_cpu_alarm Whether or not to create the high cpu alarm bool true no
create_low_cpu_credit_alarm Whether or not to create the low cpu credit alarm bool true no
create_high_queue_depth_alarm Whether or not to create the high queue depth alarm bool true no
create_low_disk_space_alarm Whether or not to create the low disk space alarm bool true no
create_low_disk_burst_alarm Whether or not to create the low disk burst alarm bool true no
create_low_memory_alarm Whether or not to create the low memory free alarm bool true no
create_swap_alarm Whether or not to create the high swap usage alarm bool true no
create_anomaly_alarm Whether or not to create the fairly noisy anomaly alarm bool true no

Outputs

Name Description
alarm_connection_count_anomalous The CloudWatch Metric Alarm resource block for anomalous Connection Count
alarm_cpu_credit_balance_too_low The CloudWatch Metric Alarm resource block for low CPU Credit Balance
alarm_cpu_utilization_too_high The CloudWatch Metric Alarm resource block for high CPU Utilization
alarm_disk_burst_balance_too_low The CloudWatch Metric Alarm resource block for low Disk Burst Balance
alarm_disk_free_storage_space_too_low The CloudWatch Metric Alarm resource block for low Free Storage Space
alarm_disk_queue_depth_too_high The CloudWatch Metric Alarm resource block for high Disk Queue Depth
alarm_memory_freeable_too_low The CloudWatch Metric Alarm resource block for low Freeable Memory
alarm_memory_swap_usage_too_high The CloudWatch Metric Alarm resource block for high Memory Swap Usage
alarm_maximum_used_transaction_ids_too_high The CloudWatch Metric Alarm resource block for postgres' Transaction ID Wraparound