hashicorp/terraform-provider-kubernetes

Default secret no longer being generated for service account, with Kubernetes 1.24.0

CRidge opened this issue · 13 comments

Terraform version, Kubernetes provider version and Kubernetes version

Terraform version: 1.2.1
Kubernetes Provider version: 2.11.0
Kubernetes version: 1.24.0
Docker Desktop version: 4.8.2

Terraform configuration

Original version:

resource "kubernetes_service_account" "debugging" {
  metadata {
    name      = "serviceaccount"
    namespace = "debugging"
  }
  automount_service_account_token = true
}

Updated version:

resource "kubernetes_secret" "debugging" {
  metadata {
    name      = "serviceaccount-token-secret"
    namespace = "debugging"
    annotations = {
      "kubernetes.io/service-account.name"      = "serviceaccount"
      "kubernetes.io/service-account.namespace" = "debugging"
    }
  }
  type = "kubernetes.io/service-account-token"
}

resource "kubernetes_service_account" "debugging" {
  metadata {
    name      = "serviceaccount"
    namespace = "debugging"
  }
  automount_service_account_token = false
}

Question

I'm converting some YAML files to Terraform and have found an issue I can't seem to get around.

I'm creating a service account, which worked fine when using kubectl. However, through Terraform I just get an error like Waiting for default secret of "debugging/serviceaccount" to appear.

With debug output on, I see Configuration contains 0 secrets, saw 0, expected 1 repeated several times before the apply fails.

I see that since Kubernetes 1.24.0, service accounts no longer get this secret automatically generated (with or without automount_service_account_token set). I've tried the recommended new way of doing this (updated version above), but it still fails as before.

Kubernetes 1.24.0 release notes with original issue

When looking at this code:

log.Printf("[DEBUG] Configuration contains %d secrets, saw %d, expected %d", len(config.Secrets), len(resp.Secrets), len(config.Secrets)+1)

... it seems a service account cannot go through without a secret being "magically" added behind the scenes. Is this a bug/regression when using Kubernetes 1.24.0, or am I missing something?

Some additional info

I sometimes also see this error, but not always:

Error: Provider produced inconsistent result after apply
When applying changes to module.infra.kubernetes_secret.debugging, provider "module.infra.provider[\"registry.terraform.io/hashicorp/kubernetes\"]" produced an unexpected new value: Root resource was present, but now absent.

Sometimes the secrets are created as they should; other times the apply exits before all resources are deployed. When everything is in fact deployed, the application works as it should - it's just the deployment that fails.

Have encountered this problem too; it's not possible to use the provider with kubernetes_service_account resources on v1.24.x. It seems to stem from the provider looking for n+1 secrets, where n is the number of secrets manually attached to the service account in the resource, and the +1 is the usual default service account token. Without that extra secret (which is also checked to ensure it is a service-account-type secret), the validation fails.

Going to look at converting my resources related to this issue into a Helm chart instead for the time being.

If you can't downgrade your cluster, you can use kubernetes_manifest as a workaround:

resource "kubernetes_manifest" "service_account" {
  manifest = {
    "apiVersion" = "v1"
    "kind"       = "ServiceAccount"
    "metadata" = {
      "namespace" = "test"
      "name"      = "my-serviceaccount"
    }

    "automountServiceAccountToken" = true
    "imagePullSecrets" = [
      {
        "name" = "my-image-pull-secret"
      },
    ]
  }
}

We at SAS have run into this problem when attempting to create a managed K8s 1.24.0 cluster in Azure using the following versions.

Terraform version: v1.0.0
Required provider source: hashicorp/azurerm
Kubernetes Version: 1.24.0

│ Error: Waiting for default secret of "kube-system/hehewl-1-cluster-admin-sa" to appear
│
│ with module.kubeconfig.kubernetes_service_account.kubernetes_sa[0],
│ on modules/kubeconfig/main.tf line 45, in resource "kubernetes_service_account" "kubernetes_sa":
│ 45: resource "kubernetes_service_account" "kubernetes_sa" {
│
Downgrading K8s isn't an option for us, as the goal of our work is to use K8s 1.24.x. We will consider whether the proposed workaround is viable for our work. This issue hasn't been updated in some time. Is there any new information on the situation and when it might be addressed?

@alexsomesan has already submitted a fix in commit c178cdc, but it seems like his solution doesn't work completely. The mentioned commit is included in the current 2.11 release.

The changes for PR #1634 were made under the assumption that there is a default secret for a Service Account. There is no longer a default secret in K8s 1.24.

Yes, you're right. The PR has nothing to do with the changed behavior in k8s 1.24.

Is there any update on this? GKE now has 1.24 available in preview, but I cannot use Terraform to install it. I'm seeing the same issue:

│ Error: Waiting for default secret of "kube-system/hehewl-3-cluster-admin-sa" to appear
│
│ with module.kubeconfig.kubernetes_service_account.kubernetes_sa[0],
│ on modules/kubeconfig/main.tf line 42, in resource "kubernetes_service_account" "kubernetes_sa":
│ 42: resource "kubernetes_service_account" "kubernetes_sa" {

Can this be changed to include a bug label, please?

So, as @hahewlet stated above, we're now seeing this on Azure and GCP. AWS has not released k8s 1.24 at this time, so no word on whether there's a problem there. Can someone look at addressing this? It's a major change for those moving from k8s 1.23 -> 1.24 and will cause issues for folks depending on the default secret. Thx in advance.

Observing this issue in local clusters using the latest k3s version as well.

Is there any plan for how this could/should be fixed in the provider? terraform-provider-kubernetes will be broken on more and more clusters as they upgrade to 1.24…

Link to the breaking change

The LegacyServiceAccountTokenNoAutoGeneration feature gate is beta, and enabled by default. When enabled, Secret API objects containing service account tokens are no longer auto-generated for every ServiceAccount. Use the TokenRequest API to acquire service account tokens, or if a non-expiring token is required, create a Secret API object for the token controller to populate with a service account token by following this guide. (kubernetes/kubernetes#108309, @zshihang)

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.