kudobuilder/kudo

Ability to re-run the same "upgrade" plan if it fails

dboyleitrs opened this issue · 3 comments

What would you like to be added:
When we release a new version of our software, we create custom upgrade plans that can add new Kubernetes objects, pull the latest images, and create or modify Kafka topics, database tables, etc.
We tell end users to get our latest KUDO package and run a specific kudo upgrade command.
We recently had a case where an upgrade plan failed and quit about 25% of the way through because a Job timed out. We manually corrected what the Job was trying to do. Now we want to re-run the upgrade plan to complete the upgrade.
When we reissue the same command we get: operator version already installed

Why is this needed:
We don’t want to tell customers to run a full uninstall and install, nor should they have to edit the version number in the operator.yaml file. Is there a way to force the same version of the upgrade plan to re-run?

Hmmm, it might be possible to simply retrigger the plan with kudo plan trigger --instance xyz --name upgrade
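
For anyone scripting this, here is a minimal sketch of wrapping the command above in Python. The `--instance` and `--name` flags are taken from the command as quoted; the `kubectl kudo` prefix is an assumption (KUDO installed as a kubectl plugin), and the helper names `plan_trigger_command` and `retrigger` are hypothetical. It defaults to a dry run that only prints the command.

```python
import shlex
import subprocess


def plan_trigger_command(instance: str, plan: str = "upgrade") -> list[str]:
    """Build the argv for re-triggering a plan on an existing instance."""
    # Assumption: KUDO is invoked as a kubectl plugin ("kubectl kudo ...").
    # The flags mirror the command quoted in the comment above.
    return [
        "kubectl", "kudo", "plan", "trigger",
        "--instance", instance,
        "--name", plan,
    ]


def retrigger(instance: str, plan: str = "upgrade", dry_run: bool = True) -> str:
    """Return the command as a shell string; execute it unless dry_run is set."""
    cmd = plan_trigger_command(instance, plan)
    if not dry_run:
        # check=True raises CalledProcessError if the CLI exits non-zero.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)


if __name__ == "__main__":
    # Dry run: only shows what would be executed for instance "xyz".
    print(retrigger("xyz"))
```

Passing `dry_run=False` would actually execute the command, so it is worth checking the printed string against your own instance name first.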

#1764 asks about a similar issue. I couldn't re-run the same plan in our case because a step was in an ERROR state rather than FATAL_ERROR.

@ANeumann82
Re-triggering the upgrade and deploy plans seems to work when errors occur during either plan. This will work for us. Possibly something to document if not already. Thanks.