operator-framework/operator-sdk

Release version increases indefinitely in Helm-based operator

Type of question

Question

What did you do?

I've created a Helm-based app operator that is currently running fine. However, there's a lingering issue bothering me. Though it hasn't caused any major problems yet, it would be better if resolved.

The problem is as follows: we use ArgoCD, following GitOps, to manage our app deployments. The Helm-based operator automatically creates secret objects such as sh.helm.release.v1.myapp.v102 to record release versions. However, every time the operator manager checks the status, it bumps the version, for example from sh.helm.release.v1.myapp.v102 to sh.helm.release.v1.myapp.v103, and so on, indefinitely.

Therefore, I'm seeking assistance on how to prevent the Helm operator from automatically generating release version secret objects. I don't need to use Helm's rollback mechanism because I'm using GitOps-based ArgoCD for app deployment management.
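
To make the naming scheme concrete, here is a minimal sketch that splits a release secret name such as sh.helm.release.v1.myapp.v102 into the release name and its revision number. It assumes the usual Helm 3 layout sh.helm.release.v1.<release-name>.v<revision>; the function names are hypothetical and are not part of Helm or operator-sdk.

```python
import re

# Assumed Helm 3 naming layout: sh.helm.release.v1.<release-name>.v<revision>
RELEASE_SECRET = re.compile(
    r"^sh\.helm\.release\.v1\.(?P<name>.+)\.v(?P<revision>\d+)$"
)


def parse_release_secret(secret_name):
    """Return (release_name, revision), or None if the name does not match."""
    match = RELEASE_SECRET.match(secret_name)
    if match is None:
        return None
    return match.group("name"), int(match.group("revision"))


def latest_revisions(secret_names):
    """Map each release name to the highest revision found in secret_names."""
    latest = {}
    for secret_name in secret_names:
        parsed = parse_release_secret(secret_name)
        if parsed is None:
            continue  # not a Helm release secret
        name, revision = parsed
        latest[name] = max(revision, latest.get(name, 0))
    return latest
```

Running latest_revisions over the secret names in a namespace at two points in time would show whether the revision for a given release keeps climbing.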

image

What did you expect to see?

What did you see instead? Under which circumstances?

Environment

Operator type:

Kubernetes cluster type:

$ operator-sdk version

$ go version (if language is Go)

$ kubectl version

Additional context

Hi @hydra-bu, what version of operator-sdk are you using? Do you happen to have any logs from the controller, or output from other resources?

@hydra-bu Can you check if you have a template that populates a field with a random value in each loop? Something like a secret that contains a random value which the chart generates in each installation. That could explain what you're experiencing.

Thank you for your response. Generating these release secret objects is normal Helm behavior: they are created automatically to record the release version number. I have not found where to control the corresponding fields. Running helm install charts-name directly also creates such a release secret object. However, since we use ArgoCD for deployment management, the secret object created by the Helm operator is not part of the manifests in the Git repo. ArgoCD therefore performs an automatic correction and deletes this extra secret object. The operator controller then performs the install again, which causes the loop. This is only my personal speculation.

Here are some logs from the operator controller:

{"level":"info","ts":"2024-06-05T04:26:15Z","logger":"helm.controller","msg":"Reconciled release","namespace":"taylor-prod","name":"eric-zuo-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"eric-zuo-ide"}
{"level":"info","ts":"2024-06-05T04:26:19Z","logger":"helm.controller","msg":"Upgraded release","namespace":"taylor-prod","name":"hydra-bu-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"hydra-bu-ide","force":true}
{"level":"info","ts":"2024-06-05T04:27:16Z","logger":"helm.controller","msg":"Reconciled release","namespace":"taylor-prod","name":"eric-zuo-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"eric-zuo-ide"}
{"level":"info","ts":"2024-06-05T04:27:20Z","logger":"helm.controller","msg":"Upgraded release","namespace":"taylor-prod","name":"hydra-bu-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"hydra-bu-ide","force":true}
{"level":"info","ts":"2024-06-05T04:28:16Z","logger":"helm.controller","msg":"Reconciled release","namespace":"taylor-prod","name":"eric-zuo-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"eric-zuo-ide"}
{"level":"info","ts":"2024-06-05T04:28:21Z","logger":"helm.controller","msg":"Upgraded release","namespace":"taylor-prod","name":"hydra-bu-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"hydra-bu-ide","force":true}
{"level":"info","ts":"2024-06-05T04:29:17Z","logger":"helm.controller","msg":"Reconciled release","namespace":"taylor-prod","name":"eric-zuo-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"eric-zuo-ide"}
{"level":"info","ts":"2024-06-05T04:29:22Z","logger":"helm.controller","msg":"Upgraded release","namespace":"taylor-prod","name":"hydra-bu-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"hydra-bu-ide","force":true}
{"level":"info","ts":"2024-06-05T04:30:17Z","logger":"helm.controller","msg":"Reconciled release","namespace":"taylor-prod","name":"eric-zuo-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"eric-zuo-ide"}
{"level":"info","ts":"2024-06-05T04:30:22Z","logger":"helm.controller","msg":"Upgraded release","namespace":"taylor-prod","name":"hydra-bu-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"hydra-bu-ide","force":true}
{"level":"info","ts":"2024-06-05T04:31:18Z","logger":"helm.controller","msg":"Reconciled release","namespace":"taylor-prod","name":"eric-zuo-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"eric-zuo-ide"}
{"level":"info","ts":"2024-06-05T04:31:23Z","logger":"helm.controller","msg":"Upgraded release","namespace":"taylor-prod","name":"hydra-bu-ide","apiVersion":"test.example.com/v1alpha1","kind":"CloudIDE","release":"hydra-bu-ide","force":true}
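
As a rough way to quantify the pattern in these logs, the following sketch counts "Upgraded release" messages per release. It assumes one JSON object per line with the msg and release fields shown above; count_upgrades is a hypothetical helper, not part of operator-sdk.

```python
import json
from collections import Counter


def count_upgrades(log_lines):
    """Count 'Upgraded release' messages per release in JSON log lines."""
    upgrades = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        if entry.get("msg") == "Upgraded release":
            upgrades[entry.get("release", "<unknown>")] += 1
    return upgrades
```

Applied to the twelve log lines above, this reports six upgrades for hydra-bu-ide and none for eric-zuo-ide, which only logs "Reconciled release".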

Looking at the secret status, you can see that the version number is constantly increasing:

image

image

If Argo is deleting any resource that the operator creates, the operator will reconcile and re-create that resource; this is the expected behavior. It seems like the Argo setup is not correct. IMO, Argo should only create the Subscription for the operator, and nothing else related to Helm.

For me, #6691 fixed the issue as of v1.35.0 when used with dryRunOption: server.