fluxcd/terraform-provider-flux

Uninstalling Flux >= 2.2.2 leaves the helmreleases.helm.toolkit.fluxcd.io CRD stuck in a terminating deadlock

pschirch opened this issue · 6 comments

While uninstalling Flux >= 2.2.2 from a cluster (e.g. in order to update the git password, see #596), the Flux CRD helmreleases.helm.toolkit.fluxcd.io gets stuck in a terminating deadlock.

Status:
  Accepted Names:
    Kind:       HelmRelease
    List Kind:  HelmReleaseList
    Plural:     helmreleases
    Short Names:
      hr
    Singular:  helmrelease
  Conditions:
    Last Transition Time:  2024-02-02T13:14:42Z
    Message:               no conflicts found
    Reason:                NoConflicts
    Status:                True
    Type:                  NamesAccepted
    Last Transition Time:  2024-02-02T13:14:42Z
    Message:               the initial names have been accepted
    Reason:                InitialNamesAccepted
    Status:                True
    Type:                  Established
    Last Transition Time:  2024-02-02T13:25:35Z
    Message:               CustomResource deletion is in progress
    Reason:                InstanceDeletionInProgress
    Status:                True
    Type:                  Terminating
  Stored Versions:
    v2beta2

Because of this, it is not possible to install Flux again.

│ Error: Bootstrap run error
│ 
│   with module.container_cluster_staging_01.flux_bootstrap_git.container_cluster[0],
│   on .terraform/modules/container_cluster_staging_01/flux.tf line 1, in resource "flux_bootstrap_git" "container_cluster":
│    1: resource "flux_bootstrap_git" "container_cluster" {
│ 
│ timeout waiting for: [CustomResourceDefinition/helmreleases.helm.toolkit.fluxcd.io status: 'Terminating']
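
A minimal sketch for checking what is holding the CRD in Terminating. The reason InstanceDeletionInProgress suggests the API server is still waiting for HelmRelease objects to be deleted, so the sketch prints the CRD's Terminating message and lists the remaining HelmRelease objects with their finalizers. The function names are hypothetical, and it assumes kubectl is on the PATH and pointed at the affected cluster.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of any Flux tooling.
CRD="helmreleases.helm.toolkit.fluxcd.io"

# Print the message of the CRD's Terminating condition (empty if not terminating).
crd_terminating_message() {
  kubectl get crd "$CRD" \
    -o jsonpath='{.status.conditions[?(@.type=="Terminating")].message}'
}

# List every remaining HelmRelease as "<namespace>/<name> <finalizers>", one per line.
remaining_helmreleases() {
  kubectl get "$CRD" --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.metadata.finalizers}{"\n"}{end}'
}
```

Calling crd_terminating_message followed by remaining_helmreleases should show whether any HelmRelease objects with finalizers are left behind after the controllers were removed.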

To resolve this issue, we tried clearing the finalizers on the CRD with a patch.

kubectl patch crd/helmreleases.helm.toolkit.fluxcd.io -p '{"metadata":{"finalizers":[]}}' --type=merge

WARNING: Do not do this unless you are prepared to lose all your application data!

The CRD helmreleases.helm.toolkit.fluxcd.io is then removed, and it is possible to install Flux again.

In our cleanly provisioned, reproducible test environment, all previously deployed Helm releases were redeployed. This also means that if a Helm chart uses a PVC, the PVC is recreated and all of its data is lost.

With version 1.1.2, uninstalling and reinstalling Flux was no problem: all CRDs were removed, and previously deployed Kubernetes resources and Helm releases were left in place, as described in the documentation.

Currently we cannot recommend using version >= 1.2.2 in production.

@stefanprodan This is a critical issue. If we can help investigate and resolve it, let us know.

Any suggestions?

We are experiencing the same issue with v2.2.3. A fix would be greatly appreciated.

Run flux uninstall to unblock it. I have no idea why TF is stuck; maybe you are trying to uninstall with an older TF provider?
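
A rough sketch of that suggestion, assuming the flux CLI and kubectl are both installed and pointed at the affected cluster. The function name and the default timeout are hypothetical. It runs the uninstall non-interactively and then waits for the HelmRelease CRD to be deleted.

```shell
#!/bin/sh
# Sketch only: the helper name and default timeout are illustrative.
CRD="helmreleases.helm.toolkit.fluxcd.io"

uninstall_flux_and_wait() {
  wait_timeout="${1:-120s}"
  # --silent answers the confirmation prompt automatically.
  flux uninstall --silent || return 1
  # Wait for the CRD object to be deleted, up to the given timeout.
  kubectl wait --for=delete "crd/$CRD" --timeout="$wait_timeout"
}
```

For example, uninstall_flux_and_wait 5m would run the uninstall and wait up to five minutes for the CRD to go away before reporting a failure.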

Many thanks. Resolved with a local-exec kubectl call on the bootstrap resource.

@kcighon as this issue is resolved, are you happy for me to close it?

Yes thank you. 😊