gravitational/teleport-plugins

teleport-plugin-event-handler upgrade fails if the new pod is deployed to a different node

TeleLos opened this issue · 1 comment

Expected behavior:
When upgrading teleport-plugin-event-handler, the upgrade should succeed even when the upgraded pod is deployed on a different node than the old one.

Current behavior:
Our Helm charts use the default rolling update strategy. With that strategy in place, if the new pod is deployed to a different node, the upgrade can fail with the following error because the old pod has not released the persistent volume.

Warning FailedAttachVolume 11m attachdetach-controller Multi-Attach error for volume "pvc-<pvc-guid>" Volume is already used by pod(s) teleport-plugin-event-handler-<pod-guid>

The workaround is to change the rolling update strategy from the default to the following:

strategy:
    rollingUpdate:
        maxSurge: 0
        maxUnavailable: 1
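
For context, here is a minimal sketch of where this setting sits in a Deployment manifest. It assumes a single replica and a volume that can only be attached to one node at a time, which is what the Multi-Attach error suggests. The resource name, labels, image, mount path, and claim name are placeholders, not values taken from the actual Helm chart.

```yaml
# Illustrative Deployment fragment; all names below are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: teleport-plugin-event-handler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: teleport-plugin-event-handler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0        # do not create extra pods above the desired replica count
      maxUnavailable: 1  # allow the single replica to be taken down during the update
  template:
    metadata:
      labels:
        app: teleport-plugin-event-handler
    spec:
      containers:
        - name: event-handler
          image: example.invalid/event-handler:placeholder  # placeholder image
          volumeMounts:
            - name: storage
              mountPath: /data  # placeholder path
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: event-handler-storage  # placeholder claim name
```

With maxSurge set to 0, the controller scales the old ReplicaSet down before it creates the replacement pod, so the old pod is already gone or terminating when the new one is scheduled. The default strategy instead surges a new pod while the old one is still running and holding the volume.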

Bug details:

  • Teleport version v15.X

Hey @TeleLos - thanks for this. An alternative would be to use

  strategy:
    type: Recreate

instead of rollingUpdate, wouldn't it? It shouldn't matter that the pods drop the PVC, as the plugin uses a pull mechanism to get the events rather than a push one, so there should be no loss of events or data for the plugin handler.
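
One detail when switching an existing Deployment to Recreate: the Kubernetes API rejects a Deployment that has type: Recreate while a rollingUpdate block is still set. A patch along these lines clears it. This is a sketch, and the file name recreate-patch.yaml is hypothetical.

```yaml
# recreate-patch.yaml (hypothetical file name)
spec:
  strategy:
    type: Recreate
    rollingUpdate: null  # remove any existing rollingUpdate settings
```

Assuming the patch is saved under that name, it could be applied with something like kubectl patch deployment <deployment-name> --patch-file recreate-patch.yaml. With Recreate, all old pods are terminated before any new pod is created.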