alphagov/gsp

CloudFormation stacks unable to reconcile if in ROLLBACK_COMPLETE state

Closed this issue · 1 comments

What

If the service operator fails to create a stack, the stack can end up in the ROLLBACK_COMPLETE state, from which the operator is unable to continue reconciling it. The only course of action is to delete the Kubernetes resource and create it again.

Potential improvements

  • When attempting to apply a stack template to a "dead" stack, the controller could first delete the stack and then create it again. This may require ensuring that the controller does not retry after the stack enters the ROLLBACK_COMPLETE state again, or we risk getting into a create/destroy loop.

I'm going to close this because, by the sounds of it, it needs some thought about what the most appropriate thing to do is when this happens.

If it happens regularly, hopefully the increased toil of handling it will help us prioritize this.