TwiN/k8s-ttl-controller

Automatically drain nodes if their TTL expires

TwiN opened this issue · 3 comments

TwiN commented

Describe the feature request

If a resource of type node has a TTL that has expired, attempt to drain the node before deleting it.

Note that for the first implementation, the drain process can be non-graceful (i.e. it can use the equivalent of kubectl drain <node> --disable-eviction, which deletes pods directly instead of evicting them and therefore bypasses PodDisruptionBudgets, or PDBs).

Why do you personally want this feature to be implemented?

To handle node deletion slightly more gracefully.
Hopefully, it will work alongside cluster-autoscaler to provide a slightly better experience than just deleting the node.

How long have you been using this project?

No response

Additional information

No response

Instead of always draining automatically, could this be configuration-based, with the option to cordon the node, drain it, or both?

The motivation is that some users on our clusters run Argo Workflows on nodes that cannot be automatically pruned, because the workflows need to complete before the node is drained. In that case, we could configure the TTL to cordon those nodes instead of draining them and let the workflows finish; the cluster autoscaler would then remove each node once it meets the criteria for a scale-down candidate.

+1 on this feature though!
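
One way to picture the configuration-based variant is sketched below in Go. It is only an assumption about how such an option could be shaped: the Action type, its three values, and the ParseAction and Steps helpers are hypothetical and do not exist in k8s-ttl-controller. An empty value falls back to plain deletion, and "cordon" stops after marking the node unschedulable so that running workloads can finish and the cluster autoscaler can remove the node later.

```go
package main

import (
	"fmt"
	"strings"
)

// Action is a hypothetical setting that selects what happens to a node
// once its TTL expires. It is not an existing k8s-ttl-controller option.
type Action string

const (
	ActionDelete Action = "delete" // delete the node without draining
	ActionCordon Action = "cordon" // mark unschedulable and stop there
	ActionDrain  Action = "drain"  // cordon, drain, then delete
)

// ParseAction normalizes a configured value.
// Assumption: an empty value means plain deletion.
func ParseAction(v string) (Action, error) {
	switch a := Action(strings.ToLower(strings.TrimSpace(v))); a {
	case "":
		return ActionDelete, nil
	case ActionDelete, ActionCordon, ActionDrain:
		return a, nil
	default:
		return "", fmt.Errorf("unknown expiry action %q", v)
	}
}

// Steps lists, in order, the operations a controller would perform.
func Steps(a Action) []string {
	switch a {
	case ActionCordon:
		// Running pods are left alone; removal of the node is left to
		// the cluster autoscaler once the node qualifies for scale-down.
		return []string{"cordon"}
	case ActionDrain:
		return []string{"cordon", "drain", "delete"}
	default:
		return []string{"delete"}
	}
}

func main() {
	for _, v := range []string{"", "cordon", "Drain", "reboot"} {
		a, err := ParseAction(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %v\n", v, Steps(a))
	}
}
```

Whether the setting would live on the node itself or in controller-wide configuration is left open here.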

TwiN commented

@dmarquez-splunk yep, that's an excellent idea!

Are you able to provide a rough estimate of when something like this can be supported?