GoogleCloudPlatform/elcarro-oracle-operator

Adding additional disks to Instance requires manual deletion of the StatefulSet


Describe the bug
When we add disks to an Instance and roll out the change with helm upgrade, the change is applied only if we manually delete the Oracle instance's StatefulSet during the upgrade.

To Reproduce

  1. Deploy an Oracle instance
  2. Update the instance spec in the helm chart to add additional disk(s)
  3. Deploy the changes with helm upgrade
  4. Observe that the disks are not created. The Instance resource is updated, but the operator cannot patch the StatefulSet because an immutable part of its spec has changed, so the StatefulSet must be recreated.
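
To illustrate step 4, here is a minimal sketch of how a changed immutable field could be detected by comparing the old and new StatefulSet specs. It assumes the added disks end up in `volumeClaimTemplates`, which the issue does not confirm, and that the set of mutable fields matches recent Kubernetes releases (older releases allow fewer). The function names and disk names are hypothetical and are not taken from the operator's code.

```python
from __future__ import annotations

# Spec fields the API server allows to change on an existing StatefulSet.
# Assumption: this mirrors recent Kubernetes releases; check your version.
MUTABLE_STATEFULSET_FIELDS = frozenset({
    "replicas",
    "ordinals",
    "template",
    "updateStrategy",
    "persistentVolumeClaimRetentionPolicy",
    "minReadySeconds",
})


def immutable_changes(old_spec: dict, new_spec: dict) -> list[str]:
    """Return the sorted names of changed fields that cannot be patched in place."""
    changed = {
        key
        for key in old_spec.keys() | new_spec.keys()
        if old_spec.get(key) != new_spec.get(key)
    }
    return sorted(changed - MUTABLE_STATEFULSET_FIELDS)


def needs_recreate(old_spec: dict, new_spec: dict) -> bool:
    """True when at least one immutable field differs between the two specs."""
    return bool(immutable_changes(old_spec, new_spec))


# Hypothetical specs: the second one adds a disk.
old = {"replicas": 1, "volumeClaimTemplates": [{"metadata": {"name": "data-disk"}}]}
new = {
    "replicas": 1,
    "volumeClaimTemplates": [
        {"metadata": {"name": "data-disk"}},
        {"metadata": {"name": "extra-disk"}},
    ],
}
print(immutable_changes(old, new))
```

A change to `replicas` alone would not trigger recreation under this check, while any change to the claim templates would.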

Expected behavior
The operator should recreate the StatefulSet itself so that no manual intervention is required.
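
The requested behavior could look roughly like the sketch below: try an in-place update first and fall back to delete-and-create only when the update is rejected because an immutable field changed. This is not the operator's actual reconcile code. `FakeCluster`, `ImmutableFieldError`, and `reconcile_statefulset` are hypothetical names, and the in-memory cluster only imitates the API server's rejection.

```python
class ImmutableFieldError(Exception):
    """Hypothetical stand-in for the API server rejecting an update."""


class FakeCluster:
    """In-memory stand-in for the Kubernetes API, for illustration only."""

    # Simplified assumption: only these fields may change on an update.
    MUTABLE = frozenset({"replicas", "template"})

    def __init__(self):
        self.statefulsets = {}

    def create(self, name, spec):
        self.statefulsets[name] = dict(spec)

    def update(self, name, spec):
        current = self.statefulsets[name]
        for key in current.keys() | spec.keys():
            if key not in self.MUTABLE and current.get(key) != spec.get(key):
                raise ImmutableFieldError(f"field {key!r} is immutable")
        self.statefulsets[name] = dict(spec)

    def delete(self, name):
        del self.statefulsets[name]


def reconcile_statefulset(cluster, name, desired):
    """Bring the StatefulSet to the desired spec and report the action taken."""
    if name not in cluster.statefulsets:
        cluster.create(name, desired)
        return "created"
    try:
        cluster.update(name, desired)
        return "updated"
    except ImmutableFieldError:
        # The spec cannot be patched in place, so replace the object.
        cluster.delete(name)
        cluster.create(name, desired)
        return "recreated"


cluster = FakeCluster()
spec = {"replicas": 1, "volumeClaimTemplates": ["data-disk"]}
print(reconcile_statefulset(cluster, "db", spec))
spec_with_disk = {"replicas": 1, "volumeClaimTemplates": ["data-disk", "extra-disk"]}
print(reconcile_statefulset(cluster, "db", spec_with_disk))
```

A real implementation would also have to decide whether the existing pods are deleted together with the StatefulSet or orphaned and re-adopted. By default, Kubernetes does not delete the PersistentVolumeClaims when a StatefulSet is deleted.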

Workaround
Deleting the StatefulSet manually after running helm upgrade resolves the issue, because the operator then recreates it.
However, we can't automate this, because our deployment pipeline is blocked waiting for helm upgrade (run with --wait) to finish. Deleting the StatefulSet before running helm upgrade doesn't work either, because the operator recreates it immediately, before Helm can apply the changes.
We need --wait during the helm upgrade because it lets us determine exactly when the upgrade has completed and the app deployment is ready to use.