operator-framework/operator-sdk

During reconciliation client.Update call consider default GOLang values as not initialized

SevcikMichal opened this issue · 8 comments

Bug Report

What did you do?

Created a custom resource based on a CRD definition generated from a type struct using provided make commands (make generate manifests).
Here is a snippet of the type with one of the fields important for this issue (Preload bool):

type WebComponentSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// The URI from which the module shall be accessed. The actual module is cached by the controller to improve performance and avoid CORS issues.
	// +kubebuilder:validation:Format=url
	// +operator-sdk:csv:customresourcedefinitions:type=spec
	ModuleUri string `json:"module-uri"`

	// The modules are not preloaded by default but only when navigating to some of the subpaths mentioned in the 'navigation' list. Setting this property to true ensures that the module is loaded when the application starts.
	// +kubebuilder:default=true
	// +operator-sdk:csv:customresourcedefinitions:type=spec
	Preload bool `json:"preload,omitempty"`
}

The Preload field is then represented like this in the generated crd.yaml:

preload:
  default: true
  description: The modules are not preloaded by default but only when navigating to some of the subpaths mentioned in the 'navigation' list. Setting this property to true ensures that the module is loaded when the application starts.
  type: boolean

I created a CR with this value explicitly set to false as seen here (removed not interesting fields to make it as small as possible):

apiVersion: microfrontend.michalsevcik.dev/v1alpha1
kind: WebComponent
metadata: 
  name: msevcik-ambulance-ufe
spec:   
  module-uri: http://msevcik-ambulance-ufe.wac-hospital/build/ambulance-list.esm.js
  preload: false

During the reconcile function I add the finalizers and update the resource:

        if !controllerutil.ContainsFinalizer(webComponent, webComponentFinalizer) {
		log.Info("Adding Finalizer for WebComponent.")
		if ok := controllerutil.AddFinalizer(webComponent, webComponentFinalizer); !ok {
			log.Error(err, "Failed to add finalizer into the custom resource!")
			return ctrl.Result{Requeue: true}, nil
		}

		if err = r.Update(ctx, webComponent); err != nil {
			log.Error(err, "Failed to update custom resource to add finalizer!")
			return ctrl.Result{}, err
		}

		return ctrl.Result{}, nil
	}

What did you expect to see?

I expected the finalizers field to be added to the CR and no change in the spec part of the CR.

What did you see instead? Under which circumstances?

CRs spec was updated and preload was changed to true.

Environment

Operator type:

/language go

Kubernetes cluster type:

docker-desktop

$ operator-sdk version

operator-sdk version: "v1.29.0", commit: "78c564319585c0c348d1d7d9bbfeed1098fab006", kubernetes version: "v1.26.0", go version: "go1.20.4", GOOS: "darwin", GOARCH: "arm64"

$ go version (if language is Go)

go version go1.20.4 darwin/arm64

$ kubectl version

Client Version: v1.25.4
Kustomize Version: v4.5.7
Server Version: v1.25.4

Possible Solution

I have tested this as well with an int value where I set the // +kubebuilder:default=1.
Since go int default is 0 I explicitly set the value in CR to 0.
After update call the int value was updated to 1 as per kubebuilder default even though it was explicitly set to 0 during apply.

I am not sure what the solution could be there but as workaround I defined the bool value as a pointer (*bool) so the default value in go is nil instead of false. I am not sure if this is an expected behavior or not.

Additional context

It is hard for me to currently pinpoint where the problem could be maybe I should have created this issue in controller-runtime or maybe it is even problem in Kubernetes implementation of docker-desktop. However I did not stumble upon same issue during my google searches therefore I hope it will serve someone that might face the same issue in future.

Based on the code snippet provided, looks like there is only removal of the finalizer. Removing of the finalizer changes the metadata and shouldn't be modifying the spec of the resource. However, if the update is causing a change then its more likely because of some action being done in doFinalizerOperationsForWebComponent. Could you share what the method does? Reviewing its implementation could be provide more insights on what is happening with the object. And just to confirm, do you see any errors in the logs causing the object to be requeued?

Apologies I pasted wrong snippet..
I meant to paste in this one (updated in the issue as well):

if !controllerutil.ContainsFinalizer(webComponent, webComponentFinalizer) {
		log.Info("Adding Finalizer for WebComponent.")
		if ok := controllerutil.AddFinalizer(webComponent, webComponentFinalizer); !ok {
			log.Error(err, "Failed to add finalizer into the custom resource!")
			return ctrl.Result{Requeue: true}, nil
		}

		if err = r.Update(ctx, webComponent); err != nil {
			log.Error(err, "Failed to update custom resource to add finalizer!")
			return ctrl.Result{}, err
		}

		return ctrl.Result{}, nil
	}

I was also debugging the code and if I put the break point right on the line where the r.Update is happening I can see that before executing the Update the Preload bool is false. After I call the update it gets set to true.

And I am not updating any of the spec fields in the code I only propagate them to a frontend application via an api no values are written to only read.

The AddFinalizer method only appends the finalizer and does not interact with the client. If there is nothing else updating the resource, then it should not be changing the spec. This needs more investigation by comparing the latest stored version in etcd and the output returned by server. (Though I wouldn't expect garbage collection to be happening so soon causing a new object with default values to be returned - this could be checked by tracking the resourceVersion)

I made a short video showcasing the issue:
https://www.youtube.com/watch?v=YLc-5gXQ3b0

Here is the branch that is used in the video (on main I have the workaround):
https://github.com/SevcikMichal/microfrontends-controller/tree/default-values-issue

I could also reproduce this with e.g. int which default value is 0, if I explicitly set the int value to 0, but specify default value with kubebuilder to 1, after update the value becomes 1.

This issue was discussed in the issue triage meeting. The possible reason for this is as you mentioned is setting the value to be boolean instead of a pointer. Looks like the default value for the boolean field is taken as false. Hence after removing the finalizer, the K8s api server is not able distinguish if that's a default value or something that is been explicitly set. Hence it converts it to a default value. As you mentioned, the correct way is to pass the boolean values as pointers.

@SevcikMichal Hope the above comment answers your question. If not, please feel free to reopen the issue. Thank you!

@varshaprasad96
Thanks for the confirmation.
It would be nice to have this covered somewhere in the documentation so if somebody stumbles upon in the future at least they know it is behaving as expected.

@SevcikMichal Makes sense. But this would be a more apt issue on the kubebuilder side as we inherit the markers from there to set defaults. Would you mind opening a PR to add details on this issue to the kubebuilder book in the markers section (https://kubebuilder.io/reference/controller-gen.html?highlight=default,maekers#generators) as a note? That would be really helpful.