fluxcd/flux

Error looking up service account XXX because it is not found, although it's in the manifest

doctorpangloss opened this issue · 3 comments

Describe the bug

When I define a ServiceAccount and a Pod that uses it in the same file, in an otherwise empty (implicit Kustomize) directory in my Flux-managed Git repo, Flux fails to reconcile. I expect it to work, because it is super routine.

Steps to reproduce

  1. Create a 1.22 EKS cluster with Flux installed via eksctl.
  2. Write the calicoctl.yaml to an empty kube-system directory in your repo:
    mkdir -pv kube-system
    curl -L https://projectcalico.docs.tigera.io/manifests/calicoctl.yaml > ./kube-system/calicoctl.yaml
    
  3. Observe the following failure:
    flux-system     kustomization/flux-system       False   Pod/kube-system/calicoctl dry-run failed, reason: Forbidden, error: pods "calicoctl" is forbidden: error looking up service account kube-system/calicoctl: serviceaccount "calicoctl" not found       main/d53c4c388b4fac63493368e5794b4cdb21604809   False
  4. Observe that kubectl apply -f calicoctl.yaml works just fine.

Expected behavior

It should reconcile.

Kubernetes version / Distro / Cloud provider

EKS v1.22.6

Flux version

flux: v0.24.1

Git provider

GitHub

Maintenance Acknowledgement

  • I am aware of Flux v1's maintenance status

Greetings! I am not certain whether this is meant to report a regression or to request some other behavior change. Flux v1 is in maintenance mode (#3320) and cannot have any behavior changes, unless it is to address a regression.

Flux v2 has dependency ordering behavior, and I suspect your issue is already addressed there. I understand that eksctl still bundles Flux v1, but we recommend that all Flux users migrate from Flux v1 to Flux v2 according to the Migration Timetable document. That document has been published in essence since 2020 and has been updated periodically since then, but the recommended guidance remains to move away from Flux v1.

If this change represents a behavior difference, then it cannot be accepted into Flux v1, as Flux follows semver. This is unfortunate where Flux v1 does not work the way we wanted it to, but it is in long-term maintenance mode and stability is our number one requirement now.

The behavior changes have all gone into Flux v2, which is getting ready for CNCF Graduation and GA. Flux v2 can still have behavior changes from time to time, though for the most part a stable contract must be guaranteed to satisfy API stability.

In Flux v2, if you need to order the application of certain manifests, you can put them in separate Kustomizations and add an optional spec.dependsOn reference from the dependent Kustomization to the one it depends on. The order of application is then guaranteed when both come from the same source GitRepository.
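
The spec.dependsOn arrangement described above could look roughly like the following. This is a minimal sketch, not a confirmed fix for this report: the Kustomization names (calicoctl-rbac, calicoctl), the two paths, and the interval are hypothetical, and it assumes the ServiceAccount and the Pod have been split into separate directories. The apiVersion is assumed to match the Flux v0.24.x release mentioned in the report; newer releases use a later API version.

```yaml
# Sketch only: names, paths, and interval are hypothetical.
# Applies the ServiceAccount (and any RBAC) first.
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: calicoctl-rbac
  namespace: flux-system
spec:
  interval: 10m
  path: ./kube-system/rbac   # hypothetical directory holding the ServiceAccount
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
---
# Applies the Pod only after calicoctl-rbac has reconciled successfully.
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: calicoctl
  namespace: flux-system
spec:
  dependsOn:
    - name: calicoctl-rbac   # same namespace, so no namespace field is needed
  interval: 10m
  path: ./kube-system/pod    # hypothetical directory holding the Pod
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

With a layout like this, the two directories would presumably also need to sit outside the path of any parent Kustomization that recurses into them, otherwise the same objects would be applied twice.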

Does this help at all? I'm sorry if this is not the answer you were looking for today!

I meant to report this in Flux v2. I am using Flux v2.

I think there might be some room to update the behavior in Flux v2. I had this issue myself when defining a pod and a service account together, and it was resolved by creating a deployment and a service account instead. The pod and service account pair does have an issue. I think it is because the pod is ephemeral and doesn't have any lifecycle of its own.

When I created the deployment instead of a pod, there was no issue with service account and deployment creation order.
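
For reference, the workaround described above might be written as a single file like the one below. It is a minimal sketch: the labels and the container image are placeholders rather than values taken from the calicoctl manifest. One plausible explanation for why this works, not confirmed in this thread, is that the service account lookup happens when a Pod is admitted, and a Deployment's Pods are only created later by its controller, after the ServiceAccount already exists.

```yaml
# Sketch only: labels and image are placeholders, not from calicoctl.yaml.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: calicoctl
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: calicoctl
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: calicoctl
  template:
    metadata:
      labels:
        app: calicoctl
    spec:
      # The Pods reference the ServiceAccount defined in the same file.
      serviceAccountName: calicoctl
      containers:
        - name: calicoctl
          image: registry.example.com/calicoctl:placeholder  # placeholder image
```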