Orange-OpenSource/casskop

[Operator-sdk] Rethink the way we update an object

Closed this issue · 3 comments

Today we are working with Operator-sdk v0.9.0: object modifications are managed in updating objects from client side, which is injected with a replace into server side : https://github.com/operator-framework/operator-sdk/blob/master/doc/user/client.md#update

In addition we have old k8s.io dependency : https://github.com/Orange-OpenSource/casskop/blob/master/go.mod#L41

replace (
	k8s.io/api => k8s.io/api v0.0.0-20190222213804-5cb15d344471
	k8s.io/apimachinery => k8s.io/apimachinery v0.0.0-20190221213512-86fb29eff628
	k8s.io/client-go => k8s.io/client-go v0.0.0-20190228174230-b40b2a5939e4
)

With this two things come some issues, because if we have an old version of object client-side and if the old version is pushed to server-side we could have conflits.

To give a concrete example, since Kubernetes 1.15 offers PreemptionPolicy field into the pod spec :
https://github.com/kubernetes/api/blob/master/core/v1/types.go#L3028 :

// PodSpec is a description of a pod.
type PodSpec struct {
...
	// PreemptionPolicy is the Policy for preempting pods with lower priority.
	// One of Never, PreemptLowerPriority.
	// Defaults to PreemptLowerPriority if unset.
	// This field is alpha-level and is only honored by servers that enable the NonPreemptingPriority feature.
	// +optional
	PreemptionPolicy *PreemptionPolicy `json:"preemptionPolicy,omitempty" protobuf:"bytes,31,opt,name=preemptionPolicy"`
...
}

In a k8s cluster with, for example, the following feature-gate:

-feature-gates=AllAlpha=true

The following alpha feature about PreemptionPolicy, will be enabled and all pods will have into their Spec the field PreemptionPolicy set to PreemptLowerPriority.

However when we are performing an update of a pod, for example when we are realizing a rebuild operation, the operator will update the pod labels : https://github.com/Orange-OpenSource/casskop/blob/master/pkg/controller/cassandracluster/pod_operation.go#L206

func (rcc *ReconcileCassandraCluster) startOperation(cc *api.CassandraCluster, status *api.CassandraClusterStatus,
	pod v1.Pod, dcRackName, operationName string) error {
	logrus.WithFields(logrus.Fields{"cluster": cc.Name, "rack": dcRackName, "pod": pod.Name,
		"operation": strings.Title(operationName)}).Info("Start operation")
	labels := map[string]string{"operation-status": api.StatusOngoing,
		"operation-start": k8s.LabelTime(), "operation-end": ""}

	err := rcc.UpdatePodLabel(&pod, labels)
        ...

Then, when we push the pod object built from client-side to server, it's not the same version, because in the podSpec client-side, the PreemptionPolicy field doesn't exist. As we perform a replace, with client.Update,

func (rcc *ReconcileCassandraCluster) UpdatePod(pod *v1.Pod) error {
:

func (rcc *ReconcileCassandraCluster) UpdatePod(pod *v1.Pod) error {
	err := rcc.client.Update(context.TODO(), pod)
        ...

Server-side, the applied pod contains the field PreemptionPolicy with value nil, which leads to the following error :

Pod "cassandra-cl02-live-bgl-c-1" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
  core.PodSpec{
  	... // 20 identical fields
  	PriorityClassName: "",
  	Priority:          &0,
- 	PreemptionPolicy:  nil,
+ 	PreemptionPolicy:  &"PreemptLowerPriority",
  	DNSConfig:         nil,
  	ReadinessGates:    nil,
  	... // 4 identical fields
  }

Proposition :

  • Upgrading to the latest Operator SDK version, to be up to date on k8s.io modules dependencies.
  • Reviewing the way we update objects using new option : client.Patch.

OK let's upgrade our operator-sdk to 0.14 to start with.
Then we can test the patch "method"

@erdrix any news on that issue ?

we've done it in a couple of places and will do it when changes are needed