IBM/varnish-operator

Exposing container lifecycle and pod terminationGracePeriodSeconds configuration in VarnishCluster CRD

Closed this issue · 2 comments

Hello, great project! I have a use case where I'd like to add containers[].lifecycle.preStop configuration to the Varnish StatefulSet generated by the VarnishCluster CRD, but there doesn't currently seem to be a way to do that.

My use case is as follows:

  • Configured the VCL as a self-routing cluster with consistent hashing
  • Want zero downtime when pods move during upgrades (or any other maintenance)
  • In conjunction with a deregistration timeout on the load balancer, an increased terminationGracePeriodSeconds, and an otherwise unused "shutdown" VCL configuration included in my VCL ConfigMap, the following lifecycle config on the varnish container allows graceful pod removal with no errors or dropped traffic (a fuller pod spec sketch follows the snippet):
lifecycle:
  preStop:
    exec:
      command:
      - /bin/bash
      - -c
      - varnishadm vcl.load shutdown /etc/varnish/entrypoint_shutdown.vcl
        && varnishadm vcl.use shutdown && sleep 60
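
For reference, here's a rough sketch of how that hook might sit alongside terminationGracePeriodSeconds in the pod spec; the 90s grace period and the image line are illustrative assumptions, not the operator's actual generated manifest:

spec:
  terminationGracePeriodSeconds: 90   # assumed value; must cover the VCL switch plus the 60s sleep
  containers:
  - name: varnish
    image: varnish:stable             # illustrative image, not the operator's
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/bash
          - -c
          - >
            varnishadm vcl.load shutdown /etc/varnish/entrypoint_shutdown.vcl
            && varnishadm vcl.use shutdown && sleep 60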

Worth mentioning that entrypoint_shutdown.vcl is identical to my main entrypoint.vcl, except that the heartbeat probe between Varnish cluster pods returns a 404. That marks the pod sick to all members, so traffic drains and is no longer sent to it.
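
For illustration, a minimal sketch of what the shutdown VCL's probe handling could look like; the /heartbeat path, backend address, and port are assumptions rather than the contents of my actual entrypoint.vcl:

vcl 4.0;

backend default {
    .host = "127.0.0.1";   # assumed backend; the real config differs
    .port = "8080";
}

sub vcl_recv {
    # Assumed probe path; the real heartbeat URL may differ.
    if (req.url == "/heartbeat") {
        # Answer 404 so peer pods mark this instance sick and drain traffic away from it.
        return (synth(404, "Shutting down"));
    }
}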

cin commented

@lucasreed this sounds like it'd be a solid addition! I like the idea of increasing the terminationGracePeriodSeconds, as that should handle any failure or hang in the varnishadm commands (and allow the StatefulSet to continue rolling during an upgrade).

Does it make sense to make the duration of the sleep configurable? That would give users some flexibility in how client applications close connections. We may also want to set it low for testing.
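
Purely as a strawman, a configurable drain could surface as something like the fields below; these names are hypothetical and not part of the current VarnishCluster API:

spec:
  varnish:
    shutdownDelaySeconds: 60          # hypothetical field: trailing sleep in the generated preStop hook
  terminationGracePeriodSeconds: 90   # hypothetical placement; kept larger than the shutdown delay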

Unfortunately, this repo isn't being maintained at the moment. I've been maintaining a fork with the hope of periodically merging back here. Feel free to submit a PR for this feature against the fork, and we'll try to get it merged. Thanks!

I went with a different project after all (https://github.com/mittwald/kube-httpcache)