kubernetes-graceful-termination

Example Go program for gracefully terminating a Pod running in Kubernetes

Primary language: Go. License: MIT.

Slides: https://docs.google.com/presentation/d/1qYsDH_6Jg2PE3K-LOi64DrqCl5FKC4Hq34R6hxClHK8/edit?usp=sharing

Pre-reqs

  • make
  • go
  • docker
  • kind
  • kubectl

Quick start

# build and test
$ make

# deploy to kind cluster
$ make kind

Demos

Setup

Start your kind cluster and deploy the example graceful-terminator Pod:

$ make kind

For each demo you should run the following steps:

# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

In another window, run the command for the demo.

Note: We use --wait=false in each demo so that kubectl returns to your prompt immediately instead of waiting for the Pod to be removed.

Graceful Shutdown (configured timeout 60s)

Try a normal graceful termination workflow

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# delete the pod using the configured grace period
$ kubectl delete pod graceful-terminator --wait=false

# Window 2
# try issuing another delete request
$ kubectl delete pod graceful-terminator --wait=false --grace-period=3
$ kubectl delete pod graceful-terminator --wait=false --grace-period=3
$ kubectl delete pod graceful-terminator --wait=false --grace-period=3

# It doesn't even work with --force!
$ kubectl delete pod graceful-terminator --wait=false --grace-period=0 --force

# Pods only hear the first delete request!

Notice:

  • Pod shuts down in 60s total
  • Pod drains its active connections at a rate relative to the 60s deadline
  • Pods only respect the first delete request!
  • Pod exits successfully
  • Kubectl Events include a Killing Event for 60s -- same as --grace-period

Graceful Shutdown (override timeout)

Let's speed up (or slow down) the termination by specifying a grace period that differs from the configured one. Here we use a shorter one.

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# delete the pod overriding the grace period
$ kubectl delete pod graceful-terminator --wait=false --grace-period=10

Notice:

  • Pod shuts down in 10s total (the specified grace period)
  • Pod drains its active connections at a rate relative to the 10s deadline
  • Pod exits successfully (if the max drain rate allows for it)
  • Pod exits unsuccessfully (if we can't drain connections fast enough)
  • Kubectl Events include a Killing Event for 10s -- same as --grace-period

Graceful Shutdown (asap)

Let's delete the Pod as fast as possible while still allowing kubectl to go through the motions

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# delete the pod ASAP
$ kubectl delete pod graceful-terminator --wait=false --grace-period=1

Notice:

  • Pod shuts down in 1s total (the fastest possible graceful termination)
  • Pod exits unsuccessfully (if we can't drain connections fast enough)
  • Kubectl Events include a Killing Event for 1s -- same as --grace-period

Non-Graceful Shutdown (force)

Now let's see why --force is not a good idea

Note: This demo uses 3 windows. Window 3 is a crictl session inside the control-plane node of the kind cluster.

Setup crictl window:

# on localhost
$ docker exec -it kind-control-plane bash

# inside kind-control-plane
# watch all Pods named graceful-terminator
% crictl statsp --label io.kubernetes.pod.name=graceful-terminator --watch

Demo:

# Window 1
# re-deploy the graceful-terminator Pod and start tailing logs
$ make watch

# Logs

# Window 2
# forcefully delete the pod
$ kubectl delete pod graceful-terminator --wait=false --force

# Notice: kubectl doesn't see this pod anymore
$ kubectl get pods

# Notice: Window 3: The pod is still running and consuming resources!

# Keep trying... it doesn't do anything...
$ kubectl delete pod graceful-terminator --wait=false --force

# We can even launch more to consume more resources!
# Window 1
$ make watch
# Window 2
$ kubectl delete pod graceful-terminator --wait=false --force
# Window 3
# Now we have 2 zombie pods!

# Window 1
$ make watch
# Window 2
$ kubectl delete pod graceful-terminator --wait=false --force
# Window 3
# Now we have 3 zombie pods!

# etc...

Notice:

  • kubectl and the Kubernetes API server return immediately and report that the pod was deleted successfully
  • kubectl's success message is hogwash
  • Container process still hears SIGTERM and tries to terminate gracefully in the background (even though we can't see the Pod)
  • Pod shuts down in 1s total (the fastest possible graceful termination)
  • Pod exits unsuccessfully (if we can't drain connections fast enough)
  • Kubectl Events include a Killing Event for 1s -- same as --grace-period