This project ensures that the Kubernetes control plane responds appropriately to events that can cause your EC2 instance to become unavailable, such as EC2 maintenance events and EC2 Spot interruptions. If these events are not handled, your application code may not have enough time to stop gracefully, your workloads may take longer to recover full availability, or new work may be scheduled onto nodes that are going down. The handler runs a small pod on each host to monitor for these events and react accordingly. When we detect that an instance is going down, we use the Kubernetes API to cordon the node so that no new work is scheduled there, then drain it, removing any existing work.
The termination handler watches the instance metadata service to determine when to make requests to the Kubernetes API to mark the node as non-schedulable. If the maintenance event is a reboot, we also apply a custom label to the node so that we can remove the cordon when the node restarts.
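The detect, cordon, drain flow described above can be sketched in shell. This is only an illustration of the idea, not the handler's actual implementation: the function names, the `NODE_NAME` variable, the polling interval, and the unauthenticated metadata request are all assumptions made for the example. The kubectl commands are printed rather than executed so the sketch is safe to run anywhere.

```shell
# Illustrative sketch only; all function names here are hypothetical.

# Extract the "action" field from a Spot instance-action document such as
# {"action": "terminate", "time": "2017-09-18T08:22:00Z"}
extract_action() {
  printf '%s' "$1" | sed -n 's/.*"action"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print the kubectl commands that would cordon and then drain a node.
plan_for_node() {
  echo "kubectl cordon $1"
  echo "kubectl drain $1 --ignore-daemonsets"
}

# Decide what to do for one metadata response (an empty response means no notice).
handle_notice() {
  action="$(extract_action "$2")"
  case "$action" in
    terminate|stop|hibernate) plan_for_node "$1" ;;
    *) return 0 ;;
  esac
}

# Poll the Spot instance-action endpoint until a notice appears (not invoked here).
# Assumes an unauthenticated metadata request and a NODE_NAME variable set by the caller.
watch_spot_notice() {
  while true; do
    doc="$(curl -sf --max-time 2 http://169.254.169.254/latest/meta-data/spot/instance-action || true)"
    plan="$(handle_notice "$NODE_NAME" "$doc")"
    if [ -n "$plan" ]; then
      echo "$plan"
      return 0
    fi
    sleep 5
  done
}
```

Calling `handle_notice node-1 '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'` prints the cordon command followed by the drain command for `node-1`; an empty response prints nothing.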
You can run the termination handler on any Kubernetes cluster running on AWS, including self-managed clusters and those created with Amazon Elastic Kubernetes Service.
- Monitors EC2 Metadata for Scheduled Maintenance Events
- Monitors EC2 Metadata for Spot Instance Termination Notifications
- Helm installation and event configuration support
- Webhook feature to send shutdown or restart notification messages
- Unit & Integration Tests
The termination handler installs into your cluster a ServiceAccount, ClusterRole, ClusterRoleBinding, and a DaemonSet. All four of these Kubernetes constructs are required for the termination handler to run properly.
You can use kubectl to directly add all of the above resources with the default configuration into your cluster.
kubectl apply -f https://github.com/aws/aws-node-termination-handler/releases/download/v1.7.0/all-resources.yaml
For a full list of releases and associated artifacts see our releases page.
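If you want to pin a different release, a small helper can build the manifest URL from a version tag. This is a sketch under one assumption: that other releases publish an `all-resources.yaml` asset under the same path as v1.7.0. Check the releases page before relying on it. The function name `nth_manifest_url` is made up for this example.

```shell
# Hypothetical helper: build the manifest URL for a given release tag.
# Assumes every release ships an all-resources.yaml asset, as v1.7.0 does.
nth_manifest_url() {
  echo "https://github.com/aws/aws-node-termination-handler/releases/download/$1/all-resources.yaml"
}
```

You could then run `kubectl apply -f "$(nth_manifest_url v1.7.0)"`, which is equivalent to the command shown above.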
The easiest way to configure the various options of the termination handler is via helm. The chart for this project is hosted in the eks-charts repository.
To get started, add the eks-charts repo to helm:
helm repo add eks https://aws.github.io/eks-charts
Once that is complete you can install the termination handler. We've provided some sample setup options below.
Zero Config:
helm upgrade --install aws-node-termination-handler \
--namespace kube-system \
eks/aws-node-termination-handler
Enabling Features:
helm upgrade --install aws-node-termination-handler \
--namespace kube-system \
--set enableSpotInterruptionDraining="true" \
--set enableScheduledEventDraining="false" \
eks/aws-node-termination-handler
Running Only On Specific Nodes:
helm upgrade --install aws-node-termination-handler \
--namespace kube-system \
--set nodeSelector.lifecycle=spot \
eks/aws-node-termination-handler
Webhook Configuration:
helm upgrade --install aws-node-termination-handler \
--namespace kube-system \
--set webhookURL=https://hooks.slack.com/services/YOUR/SLACK/URL \
eks/aws-node-termination-handler
Alternatively, pass Webhook URL as a Secret:
WEBHOOKURL_LITERAL="webhookurl=https://hooks.slack.com/services/YOUR/SLACK/URL"
kubectl create secret -n kube-system generic webhooksecret --from-literal="$WEBHOOKURL_LITERAL"
helm upgrade --install aws-node-termination-handler \
--namespace kube-system \
--set webhookURLSecretName=webhooksecret \
eks/aws-node-termination-handler
For a full list of configuration options see our Helm readme.
Using the termination handler alongside Kiam requires some extra configuration on Kiam's end. By default, Kiam blocks all access to the metadata address, so you need to make sure it passes through the requests the termination handler relies on.
To add a whitelist configuration, use the following fields in the Kiam Helm chart values:
agent.whiteListRouteRegexp: '^\/latest\/meta-data\/(spot\/instance-action|events\/maintenance\/scheduled|instance-(id|type)|public-(hostname|ipv4)|local-(hostname|ipv4)|placement\/availability-zone)$'
Or just pass it as an argument to the kiam agents:
kiam agent --whitelist-route-regexp='^\/latest\/meta-data\/(spot\/instance-action|events\/maintenance\/scheduled|instance-(id|type)|public-(hostname|ipv4)|local-(hostname|ipv4)|placement\/availability-zone)$'
The termination handler relies on the following metadata endpoints to function properly:
/latest/meta-data/spot/instance-action
/latest/meta-data/events/maintenance/scheduled
/latest/meta-data/instance-id
/latest/meta-data/instance-type
/latest/meta-data/public-hostname
/latest/meta-data/public-ipv4
/latest/meta-data/local-hostname
/latest/meta-data/local-ipv4
/latest/meta-data/placement/availability-zone
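Before rolling out the Kiam change, it can help to confirm that the route pattern admits every endpoint in the list above and nothing else. The sketch below does that locally with `grep -E`. It uses the same pattern with the `\/` escapes dropped, since `grep -E` does not need them; whether Kiam evaluates the pattern identically is an assumption. The function and variable names are hypothetical.

```shell
# Same pattern as the Kiam setting, without the '\/' escapes (not needed by grep -E).
KIAM_ROUTE_REGEXP='^/latest/meta-data/(spot/instance-action|events/maintenance/scheduled|instance-(id|type)|public-(hostname|ipv4)|local-(hostname|ipv4)|placement/availability-zone)$'

# The metadata endpoints the termination handler relies on.
REQUIRED_ROUTES='
/latest/meta-data/spot/instance-action
/latest/meta-data/events/maintenance/scheduled
/latest/meta-data/instance-id
/latest/meta-data/instance-type
/latest/meta-data/public-hostname
/latest/meta-data/public-ipv4
/latest/meta-data/local-hostname
/latest/meta-data/local-ipv4
/latest/meta-data/placement/availability-zone
'

# Succeeds if the given path matches the route pattern.
route_allowed() {
  printf '%s\n' "$1" | grep -Eq "$KIAM_ROUTE_REGEXP"
}

# Report the first required route that the pattern would block, if any.
check_required_routes() {
  for route in $REQUIRED_ROUTES; do
    route_allowed "$route" || { echo "blocked: $route"; return 1; }
  done
  echo "all required routes allowed"
}
```

Running `check_required_routes` reports whether any required endpoint would be blocked, and `route_allowed` can be used to spot-check that unrelated paths, such as credentials endpoints, are still rejected by the pattern.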
For build instructions please consult BUILD.md.
- If you've run into a bug or have a new feature request, please open an issue.
- You can also chat with us in the Kubernetes Slack in the #provider-aws channel
- Check out the open source Amazon EC2 Spot Instances Integrations Roadmap to see what we're working on and give us feedback!
Contributions are welcome! Please read our guidelines and our Code of Conduct.
This project is licensed under the Apache-2.0 License.