jetstack/navigator

elasticsearch: reverting the Version field after upgrade has begun will not trigger rollback

munnerz opened this issue · 0 comments

This bug affects both the 0.1 and master releases of Navigator.

Due to the way the update_version action is implemented, if a user:

  • Updates an ESC consisting of 3 nodes in a single node pool from 6.1.1 to 6.1.3
  • Waits for at least 1 node to upgrade, but not for the upgrade of that node pool to finish
  • Reverts the version field back to 6.1.1

Navigator will class the upgrade as 'complete' and not roll back or finish rolling forward the nodes in the pool.

This is because we use an annotation to track the 'current' version of the StatefulSet during an upgrade:

currentVersionStr, ok := statefulSet.Annotations[v1alpha1.ElasticsearchNodePoolVersionAnnotation]
if !ok {
	err = fmt.Errorf("StatefulSet %q does not have an Elasticsearch version annotation", statefulSet.Name)
	state.Recorder.Event(c.Cluster, core.EventTypeWarning, "Err"+c.Name(), err.Error())
	return nil
}
// attempt to parse the version
currentVersion, err := semver.NewVersion(currentVersionStr)
if err != nil {
	err = fmt.Errorf("Invalid version string %q on statefulset %q: %v", currentVersionStr, statefulSet.Name, err)
	state.Recorder.Event(c.Cluster, core.EventTypeWarning, "Err"+c.Name(), err.Error())
	return nil
}
// this means the statefulset is already up to date. exit early.
if c.Cluster.Spec.Version.Equal(*currentVersion) {
	return nil
}

This annotation is not updated until the node pool has been completely updated. If the update is aborted halfway through by reverting the version field, the annotation is still set to the old value, so on the next sync iteration the check above finds no changes to perform (the version in the cluster spec equals the version in the annotation).
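To make the failure concrete, here is a minimal sketch of the state described above, together with one possible direction for a fix. It is not Navigator code: the `pool` type and both function names are hypothetical, and versions are compared as plain strings rather than with semver so that the example is self-contained. It also assumes the controller has some way of learning which version each pod is actually running.

```go
package main

import "fmt"

// pool is a hypothetical, simplified stand-in for a node pool's StatefulSet.
// None of these names come from Navigator.
type pool struct {
	annotatedVersion string   // value of the version annotation
	podVersions      []string // version each pod is currently running
}

// annotationOnlyUpToDate mirrors the early-exit check quoted above: it
// compares the desired version with the annotation and nothing else.
func annotationOnlyUpToDate(desired string, p pool) bool {
	return desired == p.annotatedVersion
}

// podAwareUpToDate is one possible alternative (an assumption, not the
// project's fix): the pool only counts as up to date when the annotation
// and every pod match the desired version.
func podAwareUpToDate(desired string, p pool) bool {
	if desired != p.annotatedVersion {
		return false
	}
	for _, v := range p.podVersions {
		if v != desired {
			return false
		}
	}
	return true
}

func main() {
	// One of three nodes has moved to 6.1.3. The annotation still says 6.1.1
	// because it is only written once the whole pool has been updated.
	p := pool{
		annotatedVersion: "6.1.1",
		podVersions:      []string{"6.1.1", "6.1.1", "6.1.3"},
	}
	desired := "6.1.1" // the user has reverted the version field

	fmt.Println("annotation-only check says up to date:", annotationOnlyUpToDate(desired, p))
	fmt.Println("pod-aware check says up to date:", podAwareUpToDate(desired, p))
}
```

With this state the annotation-only check reports the pool as up to date, while the pod-aware check does not, so the controller would still have work to do (rolling the upgraded node back to 6.1.1).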

/kind bug