chaos-ssm-documents

Collection of AWS SSM Documents to perform Chaos Engineering experiments


Chaos Injection for AWS resources using Amazon SSM Run Command and Automation


Collection of SSM Documents.

These documents let you perform chaos engineering experiments on resources (applications, network, and infrastructure) in the AWS Cloud.

SSM Automation documents:

To use SSM Automation, see the AWS Systems Manager Automation documentation.

  • Support for (randomly) stopping EC2 instances via API
  • Support for (randomly) stopping EC2 instances via AWS Lambda
  • Support for (randomly) terminating EC2 instances via API
  • Support for detaching EBS volumes from EC2 instances via API (ec2, ebs)
  • Support for rebooting RDS instance with proper tags via API
  • Support for CPU stress scenario via Run Command
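The "(randomly) stopping" behaviour listed above can be pictured with a short sketch. This is not the code inside the Automation documents; it only illustrates, under stated assumptions, how a random subset of instances might be chosen and turned into an AWS CLI call. The function names and the placeholder instance IDs are hypothetical, and nothing here contacts AWS.

```python
import random
from typing import List, Optional, Sequence


def pick_random_targets(
    instance_ids: Sequence[str], count: int = 1, seed: Optional[int] = None
) -> List[str]:
    """Pick up to `count` distinct instance IDs at random."""
    if count < 0:
        raise ValueError("count must be non-negative")
    rng = random.Random(seed)  # passing a seed makes the choice reproducible
    return rng.sample(list(instance_ids), min(count, len(instance_ids)))


def stop_instances_argv(
    instance_ids: Sequence[str], region: Optional[str] = None
) -> List[str]:
    """Build (but do not run) the AWS CLI call that stops the chosen instances."""
    if not instance_ids:
        raise ValueError("no instances selected")
    argv = ["aws", "ec2", "stop-instances", "--instance-ids", *instance_ids]
    if region:
        argv += ["--region", region]
    return argv


if __name__ == "__main__":
    candidates = ["i-0aaa", "i-0bbb", "i-0ccc"]  # placeholder IDs, not real instances
    print(stop_instances_argv(pick_random_targets(candidates, count=1)))
```

Keeping selection separate from execution makes it possible to review the exact target list before anything is stopped.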

Upload an SSM Automation document:

aws ssm create-document --name "StopRandomInstances-API" --content file://stop-random-instance-api.yml --document-type "Automation" --document-format YAML
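Once an Automation document has been uploaded, it could be started with `aws ssm start-automation-execution`. The sketch below only assembles that command line. The helper name is hypothetical, and the parameter key in the usage line is an assumption for illustration, so check the document's own `parameters` section for the real names.

```python
import json
from typing import Dict, List, Optional


def start_automation_argv(
    document_name: str,
    parameters: Optional[Dict[str, List[str]]] = None,
    region: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) an `aws ssm start-automation-execution` call."""
    argv = ["aws", "ssm", "start-automation-execution",
            "--document-name", document_name]
    if parameters:
        # Automation parameters map each name to a list of strings;
        # the AWS CLI accepts that map as a JSON string.
        argv += ["--parameters", json.dumps(parameters, sort_keys=True)]
    if region:
        argv += ["--region", region]
    return argv


if __name__ == "__main__":
    # "ExampleParameter" is a made-up key, not one defined by the document.
    print(start_automation_argv("StopRandomInstances-API",
                                {"ExampleParameter": ["example-value"]}))
```

Passing the resulting list to `subprocess.run` would execute it with whatever credentials the AWS CLI is configured to use.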

SSM Run Command documents:

To use SSM Run Command, see the AWS Systems Manager Run Command documentation.

These documents support canceling and rollback (10s max).

  • Support for latency injection using latency-stress.yml
  • Support for latency with delta stress using latency-delta-stress.yml
  • Support for CPU burn using cpu-stress.yml
  • Support for IO stress using io-stress.yml
  • Support for memory stress using memory-stress.yml
  • Support for network corruption stress using network-corruption-stress.yml
  • Support for packet loss stress using network-loss-stress.yml
  • Support for killing a process by name using kill-process.yml
  • Support for diskspace stress using diskspace-stress.yml
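After a Run Command document from the list above is uploaded, an experiment would typically be started with `aws ssm send-command` and, if needed, stopped early with `aws ssm cancel-command`. The following is a minimal sketch that builds both command lines without running them. The helper names are hypothetical, and the `duration` parameter in the usage line is an assumption, since each document defines its own parameters.

```python
import json
from typing import Dict, List, Optional, Sequence


def send_command_argv(
    document_name: str,
    instance_ids: Sequence[str],
    parameters: Optional[Dict[str, List[str]]] = None,
    region: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) an `aws ssm send-command` call."""
    if not instance_ids:
        raise ValueError("at least one target instance is required")
    argv = ["aws", "ssm", "send-command",
            "--document-name", document_name,
            "--instance-ids", *instance_ids]
    if parameters:
        argv += ["--parameters", json.dumps(parameters, sort_keys=True)]
    if region:
        argv += ["--region", region]
    return argv


def cancel_command_argv(command_id: str, region: Optional[str] = None) -> List[str]:
    """Build (but do not run) an `aws ssm cancel-command` call."""
    argv = ["aws", "ssm", "cancel-command", "--command-id", command_id]
    if region:
        argv += ["--region", region]
    return argv


if __name__ == "__main__":
    # "duration" is an assumed parameter name; the instance ID is a placeholder.
    print(send_command_argv("cpu-stress", ["i-0aaa"], {"duration": ["60"]}))
```

The command ID needed for cancellation is returned in the output of `send-command`.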

Experimental

  • Support for configurable blackhole stress using blackhole-stress.yml
  • Support for blackhole S3 stress using blackhole-s3-stress.yml
  • Support for blackhole DynamoDB stress using blackhole-dynamo-stress.yml
  • Support for blackhole EC2 stress using blackhole-ec2-stress.yml
  • Support for blackhole DNS stress using blackhole-dns-stress.yml
  • Support for latency injection to a particular AWS service using latency-service-stress.yml

Prerequisites

Upload one document at a time

cd chaos-ssm-documents/run-command

aws ssm create-document --content file://cpu-stress.yml --name "cpu-stress" --document-type "Command" --document-format YAML

Upload all of the SSM Documents to the AWS region of your choice

cd chaos-ssm-documents/run-command

./upload-document.sh -r eu-west-1   # or another region of your choice
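Conceptually, a bulk upload repeats the single-document `create-document` call for every YAML file in the directory. The sketch below shows that idea only; it is not the content of upload-document.sh. It assumes each document is named after its file without the extension (as with cpu-stress.yml and "cpu-stress" above), and the function name is hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def create_document_commands(
    directory: str, document_type: str = "Command", region: Optional[str] = None
) -> List[List[str]]:
    """Build one `aws ssm create-document` call per *.yml file in `directory`."""
    commands = []
    for path in sorted(Path(directory).glob("*.yml")):
        argv = ["aws", "ssm", "create-document",
                "--content", f"file://{path}",
                "--name", path.stem,  # assumption: document name = file name without .yml
                "--document-type", document_type,
                "--document-format", "YAML"]
        if region:
            argv += ["--region", region]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in create_document_commands("run-command", region="eu-west-1"):
        print(" ".join(argv))
```

Printing the commands first gives a chance to review which documents would be created before anything is sent to AWS.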

Upload all of the SSM Documents using CloudFormation

cd chaos-ssm-documents/

run-command/create-cfn.sh run-command/ | tee cfn-chaos-ssm.yml

aws cloudformation create-stack --stack-name ChaosSsm --template-body file://cfn-chaos-ssm.yml

Specify the AWS region with the AWS CLI --region argument.

Once deployed, the stack cannot be updated. To apply changes, remove the existing stack and redeploy it.
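Because changes require removing and recreating the stack, a redeploy amounts to three AWS CLI calls in order: delete the stack, wait for the deletion to finish, then create it again. The sketch below assembles that sequence without running it. The function name is hypothetical; the stack name and template file in the usage line are taken from the commands above.

```python
from typing import List, Optional


def redeploy_stack_argvs(
    stack_name: str, template_file: str, region: Optional[str] = None
) -> List[List[str]]:
    """Build the delete, wait, and create calls needed to redeploy a stack."""
    region_args = ["--region", region] if region else []
    return [
        ["aws", "cloudformation", "delete-stack",
         "--stack-name", stack_name, *region_args],
        # Block until the old stack is gone; creating it earlier would fail.
        ["aws", "cloudformation", "wait", "stack-delete-complete",
         "--stack-name", stack_name, *region_args],
        ["aws", "cloudformation", "create-stack",
         "--stack-name", stack_name,
         "--template-body", f"file://{template_file}", *region_args],
    ]


if __name__ == "__main__":
    for argv in redeploy_stack_argvs("ChaosSsm", "cfn-chaos-ssm.yml", "eu-west-1"):
        print(" ".join(argv))
```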

SOME WORDS OF CAUTION BEFORE YOU START BREAKING THINGS:

  • To begin with, DO NOT use these chaos injection commands in production blindly!!
  • Always review the SSM documents and the commands in them.
  • Make sure your first chaos injections are done in a test environment and on test instances where no real, paying customers can be affected.
  • Test, test, and test more. Remember that chaos engineering is about breaking things in a controlled environment, through well-planned experiments, to build confidence that your application and your own tools can withstand turbulent conditions.

One-click Deploy via CloudFormation

Region Launch Stack
US East (N. Virginia) us-east-1 Launch Stack
US East (Ohio) us-east-2 Launch Stack
US West (N. California) us-west-1 Launch Stack
US West (Oregon) us-west-2 Launch Stack
Canada (Central) ca-central-1 Launch Stack
Africa (Cape Town) af-south-1 Launch Stack
Asia Pacific (Hong Kong) ap-east-1 Launch Stack
Asia Pacific (Mumbai) ap-south-1 Launch Stack
Asia Pacific (Seoul) ap-northeast-2 Launch Stack
Asia Pacific (Singapore) ap-southeast-1 Launch Stack
Asia Pacific (Sydney) ap-southeast-2 Launch Stack
Asia Pacific (Tokyo) ap-northeast-1 Launch Stack
Europe (Frankfurt) eu-central-1 Launch Stack
Europe (Ireland) eu-west-1 Launch Stack
Europe (London) eu-west-2 Launch Stack
Europe (Paris) eu-west-3 Launch Stack
Europe (Stockholm) eu-north-1 Launch Stack
Middle East (Bahrain) me-south-1 Launch Stack
South America (São Paulo) sa-east-1 Launch Stack