kubernetes-graceful-termination

Example demo program for gracefully terminating a Pod running in Kubernetes

Slides: https://docs.google.com/presentation/d/1qYsDH_6Jg2PE3K-LOi64DrqCl5FKC4Hq34R6hxClHK8/edit?usp=sharing

Pre-reqs

make
go
docker
kind
kubectl
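
Before running the quick start, it can help to confirm that every tool in the list above is on your PATH. The following is a minimal sketch; the helper name missing_tools is made up for this example and is not part of the repository.

```shell
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage: prints nothing when all pre-reqs are installed.
# missing_tools make go docker kind kubectl
```

Any name it prints needs to be installed before `make kind` can work.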

Quick start

# build and test
$ make

# deploy to kind cluster
$ make kind

Demos

Setup

Start your kind cluster and deploy the example graceful-terminator Pod:

$ make kind

For each demo you should run the following steps:

# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

In another window run your example command

Note: We use --wait=false in each demo so that kubectl returns to the prompt immediately instead of blocking until the Pod is gone

Graceful Shutdown (configured timeout 60s)

Try a normal graceful termination workflow

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# delete the pod using the configured grace period
$ kubectl delete pod graceful-terminator --wait=false

# Window 2
# try issuing another delete request
$ kubectl delete pod graceful-terminator --wait=false --grace-period=3
$ kubectl delete pod graceful-terminator --wait=false --grace-period=3
$ kubectl delete pod graceful-terminator --wait=false --grace-period=3

# It doesn't even work with --force!
$ kubectl delete pod graceful-terminator --wait=false --grace-period=0 --force

# Pods only hear the first delete request!

Notice:

Pod shuts down in 60s total
Pod drains its active connections at a rate relative to the 60s deadline
Pods only respect the first delete request!
Pod exits successfully
Kubectl Events include a Killing Event for 60s -- same as --grace-period
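
To compare the configured timeout with what the events report, the sketch below reads the grace period from the Pod spec and lists the Pod's events. It assumes the 60s timeout is set through the standard spec.terminationGracePeriodSeconds field, which this README does not show; the helper names are illustrative only.

```shell
# Print the grace period (in seconds) configured in the Pod spec.
# Run this while the Pod still exists.
configured_grace_period() {
  kubectl get pod "$1" -o jsonpath='{.spec.terminationGracePeriodSeconds}'
}

# List the events recorded for the Pod, including the Killing event.
pod_events() {
  kubectl get events --field-selector "involvedObject.name=$1"
}

# Usage:
# configured_grace_period graceful-terminator
# pod_events graceful-terminator
```

Run the first helper before deleting the Pod and the second one afterwards, then compare the two values.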

Graceful Shutdown (override timeout)

Let's speed up (or slow down) the termination by specifying a different deadline than the configured one; here we use a shorter one

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# delete the pod overriding the grace period
$ kubectl delete pod graceful-terminator --wait=false --grace-period=10

Notice:

Pod shuts down in 10s total (the specified grace period)
Pod drains its active connections at a rate relative to the 10s deadline
Pod exits successfully (if the max drain rate allows for it)
Pod exits unsuccessfully (if we can't drain connections fast enough)
Kubectl Events include a Killing Event for 10s -- same as --grace-period
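
Whether the Pod exits successfully comes down to simple arithmetic: the connections still open must be drained before the deadline. The sketch below is not the demo program's actual logic, which this README does not show. It only assumes a fixed number of active connections and a hypothetical maximum drain rate, and every number in the usage lines is made up.

```shell
# Minimum whole connections per second needed to drain `active`
# connections within `deadline` seconds (rounded up).
min_drain_rate() {
  active=$1
  deadline=$2
  echo $(( (active + deadline - 1) / deadline ))
}

# Succeeds (exit 0) when a server limited to `max_rate` connections
# per second can drain `active` connections within `deadline` seconds.
can_drain_in_time() {
  [ "$(min_drain_rate "$1" "$2")" -le "$3" ]
}

# Usage (hypothetical: 600 connections, max 20 connections per second):
# can_drain_in_time 600 60 20 && echo "60s is enough"
# can_drain_in_time 600 10 20 || echo "10s is too short"
```

With these made-up numbers a 60s deadline needs 10 connections per second, while a 10s deadline needs 60, which is above the assumed limit of 20.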

Graceful Shutdown (asap)

Let's delete the Pod as fast as possible while still allowing kubectl to go through the motions

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# delete the pod ASAP
$ kubectl delete pod graceful-terminator --wait=false --grace-period=1

Notice:

Pod shuts down in 1s total (the fastest possible graceful termination)
Pod exits unsuccessfully (if we can't drain connections fast enough)
Kubectl Events include a Killing Event for 1s -- same as --grace-period

Non-Graceful Shutdown (force)

Now let's see why --force is not a good idea

Note: This demo has 3 windows. Window 3 is a crictl window inside the control-plane Node in the kind cluster.

Setup crictl window:

# on localhost
$ docker exec -it kind-control-plane bash

# inside kind-control-plane
# watch all Pods named graceful-terminator
% crictl statsp --label io.kubernetes.pod.name=graceful-terminator --watch

Demo:

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# forcefully delete the pod
$ kubectl delete pod graceful-terminator --wait=false --force

# Notice: kubectl doesn't see this pod anymore
$ kubectl get pods

# Notice: Window 3: The pod is still running and consuming resources!

# Keep trying... it doesn't do anything...
$ kubectl delete pod graceful-terminator --wait=false --force

# We can even launch more to consume more resources!
# Window 1
$ make watch
# Window 2
$ kubectl delete pod graceful-terminator --wait=false --force
# Window 3
# Now we have 2 zombie pods!

# Window 1
$ make watch
# Window 2
$ kubectl delete pod graceful-terminator --wait=false --force
# Window 3
# Now we have 3 zombie pods!

# etc...

Notice:

kubectl and the Kubernetes API server return immediately and report that the pod was deleted successfully
kubectl's success message is hogwash
Container process still hears SIGTERM and tries to terminate gracefully in the background (even though we can't see the Pod)
Pod shuts down in 1s total (the fastest possible graceful termination)
Pod exits unsuccessfully (if we can't drain connections fast enough)
Kubectl Events include a Killing Event for 1s -- same as --grace-period
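
The point about SIGTERM can be tried locally without a cluster. The sketch below is a stand-in, not the graceful-terminator program itself: a background shell function plays the container process, traps SIGTERM, pretends to drain, and exits 0. The function names are invented for this example.

```shell
# Stand-in for the container process: on SIGTERM it "drains" and exits 0.
graceful_worker() {
  trap 'echo "SIGTERM received, draining"; exit 0' TERM
  while :; do
    # Sleep in the background and wait on it: unlike a foreground sleep,
    # wait returns as soon as a trapped signal arrives.
    sleep 1 &
    wait $! || true
  done
}

# Start the worker, send it SIGTERM, then report how it exited.
demo_sigterm() {
  graceful_worker &
  pid=$!
  sleep 1              # give the worker time to install its trap
  kill -TERM "$pid"
  status=0
  wait "$pid" || status=$?
  echo "worker exit status: $status"
}

# Usage:
# demo_sigterm
```

The worker keeps handling the signal whether or not anyone is still watching it, which is the same reason a force-deleted Pod's process keeps running after kubectl has lost sight of it.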
