fluxcd/flagger

Proposal: make the `skipAnalysis` in Canary object in the API library a pointer, so that it always gets rendered when the Canary object gets marshalled to json

Wenliang-CHEN opened this issue · 3 comments

Describe the feature

Problem statement

Hey all, I am not sure if I described clearly in the title.

So...we run a use case, in which we use a custom controller to watch the canary objects in the cluster. If any manual change is made to the the canary object, e.g. via the kubectl edit command, the controller always sets it back according to a given template.

Everything works, until the skipAnalysis field was changed manually from "false" to "true". The controller was never able to set it back to "false".

We did some debug, and found out: when the canary is marshalled to json/yaml while the skipAnalysis is "false" or not set, the field is not rendered.

That makes the skipAnalysis field is never communicated via the k8s client. That's how the watch fails.

Proposed solution

The we played around with a forked repo, and tried a couple of things.

Things seem to work if we make the SkipAnalysis field as a pointer instead of a value field, then the "false" value is always rendered. That solves the problem.

There should be no drawback as there is no direct reference to the field from outside of the package. All goes through the SkipAnalysis method. We just need to change the implementation of that method to make things work.

Any alternatives you've considered?

We have tried many different ways. So far only proposed approach works.

hello, can't the absence of the field be interpreted as it being set to false? if someone sets it to true via kubectl edit, that should definitely trigger a reconcile on your custom controller, since the generation of the Canary object would change.

i might be misunderstanding the core issue here, so please feel free to attach any code snippets to elaborate.

Hi @aryan9600 , I think this is exactly where the problem is:

if someone sets it to true via kubectl edit, that should definitely trigger a reconcile on your custom controller

When we set the value to false to a custom CRD.

the following happens:

  • We run a metadatacontroller. it reads the spec in the CRD
  • It sends the spec to our webhook.
    • The webhook constructs the Canary object using the Flagger API. At this step the skyAnalysis: false is still there
    • The webhook then parses the object to json with encoding/json library. we just did a json.NewEncoder(w).Encode(data). At this step, we did a debug and examined the data in json. The skipAnalysis field is totally gone.
  • Finally the metadatacontroller reconciles

Then we did an experiment. We forked the Flagger repo and did this. We used the library pointing to that branch. It fixes the problem for us.

It looks like an edge case. Probably it is only us who does this. But do let us know that you think