/ecs-lifecycle-hook

ECS cluster safe instance terminations

Primary LanguagePythonApache License 2.0Apache-2.0

ecs-lifecycle-hook

This is run as a Lambda function agaist ECS clusters backed by ASGs. It was inspired by the AWS blog post How to Automate Container Instance Draining in Amazon ECS.

Whenever a lifecycle-enabled ASG terminates a node (for example, scale-in), the terminating instance is placed into a wait state and a notification is sent on a configured SNS topic. This allows this lambda–which is triggered on SNS notifications–to read this event and perform a few steps.

First, if the instance is a member of an ECS cluster, the node is set to draining. Within the limits of the deploymentConfiguration in the service definition, new tasks are launched on active nodes. On draining nodes, tasks will eventually stop. This process can take some time for services associated with load balancers.

Once the tasks on the draining node have been stopped (or until the ASG lifecycle timeout occurs), an API call to the autoscaling group is sent to proceed in terminating the instance.

This lambda function accomplishes this process with the following process:

  1. The SNS notfication for a given instance is checked for ECS cluster membership
  2. The node is set to drain
  3. Check for daemon tasks (which are tasks started by the instance–perhaps at boot–and not by a service) a. If only daemon tasks remain, they are stopped b. If there are tasks started by services remaining, daemon tasks are not stopped
  4. Verify that all tasks have been stopped on a given instance a. If tasks remain, the notification is re-published to the SNS topic to be re-checked in the future, and the process starts over
  5. Notify the autoscaling group to proceed with terminating the instance

Building

Requires:

  • Python 3.6
  • pipenv

This will build a zip that can be uploaded to S3. From the project root:

pip install -r <(pipenv lock -r) -t dist/
cp -a src/*.py dist/
cd dist && zip -r bundle.zip .
aws s3 cp bundle.zip s3://<your-bucket>/bundle.zip

Deploying

See the provided CloudFormation stack file: stack.yaml.

Development

Installing Dependencies

After you have pipenv installed:

pipenv sync --dev

Tests

The following are required to pass for a successful build:

  • flake8 src/
  • flake8 --ignore=E501 tests/
  • PYTHONPATH=src/ pytest --verbose -s