flux-iac/tofu-controller

Tf-runner pod not created, kind Terraform stuck in "Reconciliation in progress"

manicole opened this issue · 4 comments

Hi all,
I'm trying to deploy a kind Terraform via a kind Kustomization, but often no tf-runner is created and my Terraform stays stuck in the "Reconciliation in progress" state.

I'm trying to deploy this Kustomization:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: k3s
  namespace: k3s
spec:
  targetNamespace: k3s
  interval: 10m
  retryInterval: 1m
  timeout: 5m
  sourceRef:
    kind: GitRepository
    name: gitrepo
    namespace: flux-system
  path: /k3s/manifests
  prune: true
  decryption:
    provider: sops
    secretRef:
      name: sops-age
  postBuild:
    substitute:
      service: k3s
      previous: openstack
      target: k3s
      destroy: "false"

In gitrepo/k3s/manifests, there is only my kind Terraform:

apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: ${service}
spec:
  alwaysCleanupRunnerPod: false
  serviceAccountName: ${target}
  interval: 1m
  destroy: ${destroy}
  destroyResourcesOnDeletion: true
  approvePlan: auto
  path: "/k3s/terraform"
  sourceRef:
    kind: GitRepository
    name: gitrepo
    namespace: flux-system
  dependsOn:
  - name: ${previous}
  vars:
  - name: service
    value: ${service}
  - name: previous
    value: ${previous}
  - name: target
    value: ${target}
  varsFrom:
  - kind: Secret
    name: ${previous}-output
    varsKeys:
    - instance_ip
    - instance_ssh_key
  writeOutputsToSecret:
    name: ${service}-output
  runnerPodTemplate:
    spec:
      volumes:
      - name: tmp
        emptyDir: {}
      volumeMounts:
      - name: tmp
        mountPath: "/tmp"

What happens:

  1. I apply the Kustomization file
  2. Kustomization is deployed and ready
  3. Terraform is deployed, state "Unknown" and status "Reconciliation in progress"
  4. Nothing appears in Namespace k3s, although I would expect a k3s-tf-runner to be deployed...
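
A sketch of the checks I would run at steps 3 and 4 (assuming the controller Deployment is named tf-controller in flux-system, matching the Helm release listed below; adjust if yours differs):

$ kubectl -n k3s describe terraform k3s         # conditions of the Terraform object
$ kubectl -n flux-system logs deploy/tf-controller --tail=100
$ kubectl -n k3s get events --sort-by=.lastTimestamp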

Additional info:

  • I deploy plenty of Kustomization files, and do not have any problem with the others.

  • Sometimes, by chance (no reproducible pattern), a k3s-tf-runner is created and everything works fine till the end; a few hours later, despite alwaysCleanupRunnerPod: false, the k3s-tf-runner disappears and I can't get it recreated.

  • I can't find any clue about what's happening in the tf-controller logs or in the namespace events.

  • All my kind Terraform does (gitrepo/k3s/terraform/main.tf) is run a local-exec:

resource "null_resource" "k3s" {
  provisioner "local-exec" {
    command = <<EOT
    mkdir -p ${var.instance_ssh_folder}
    echo -e '${var.instance_ssh_key}' >${var.instance_ssh_folder}/${var.instance_ssh_file}
    mkdir -p ${var.instance_kubeconfig_folder}
    touch ${var.instance_kubeconfig_folder}/${var.instance_kubeconfig_file}
    wget https://github.com/alexellis/k3sup/releases/download/0.13.6/k3sup
    chmod +x k3sup
    ./k3sup install \
    --ip ${var.instance_ip} \
    --user ${var.instance_user} \
    --ssh-key ${var.instance_ssh_folder}/${var.instance_ssh_file} \
    --cluster \
    --local-path ${var.instance_kubeconfig_folder}/${var.instance_kubeconfig_file} \
    --context ${var.instance_kubeconfig_context}
    EOT
  }
}
  • Versions:
$ kubectl version
Client Version: v1.29.3+k3s1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.3+k3s1

$ helm ls -n flux-system
NAME         	NAMESPACE  	REVISION	UPDATED                              	STATUS  	CHART                       	APP VERSION
tf-controller	flux-system	1       	2024-06-19 13:13:30.0861225 +0000 UTC	deployed	tf-controller-0.16.0-alpha.3	v0.16.0-alpha.3

Thanks in advance for helping :)

I found a new element.

Commenting out the runnerPodTemplate spec in my kind Terraform makes the k3s-tf-runner appear (once FluxCD reconciles my Kustomization).
Uncommenting it afterwards does not seem to be a problem, but who knows for how long.

Anyone?

I still have the issue: I have to apply my kind Terraform without any runnerPodTemplate to launch the tf-runner Pod (otherwise, it never appears). Once it is launched, I can add the runnerPodTemplate back to my kind Terraform and reapply to update it.
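
The two-step workaround can be sketched as a patch (hypothetical commands; they assume the object is named k3s in namespace k3s and that the full manifest lives in terraform.yaml):

$ kubectl -n k3s patch terraform k3s --type=json \
    -p '[{"op": "remove", "path": "/spec/runnerPodTemplate"}]'
$ kubectl apply -f terraform.yaml    # once k3s-tf-runner is up, re-add runnerPodTemplate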

Hello @manicole ,
Any specific reason you need to define the tmp volume in the runner pod template?

Hi @akselleirv, thanks for reacting.

Actually I am still trying to solve a problem, and thought this was the way to do it. I install k3s with k3sup (i.e. over SSH) and expose the kubeconfig as a Terraform output:

# outputs.tf
data "local_file" "kubeconfig_file" {
  filename   = "${var.instance_kubeconfig_folder}/${var.instance_kubeconfig_file}"
  depends_on = [null_resource.k3s]
}

output "kubeconfig_file" {
  description = "kubeconfig to access k3s cluster"
  value       = nonsensitive(data.local_file.kubeconfig_file.content)
}

I get the following error, and thought mounting /tmp would be enough, but it is not:

Error: Read local file data source error

  with data.local_file.kubeconfig_file,
  on outputs.tf line 1, in data "local_file" "kubeconfig_file": 
   1: data "local_file" "kubeconfig_file" { 

The file at given path cannot be read.

  Original Error: open /tmp/.kube/config: no such file or directory

This might be another issue for me to solve, but I believe it has no bearing on the problem here.
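
For what it's worth, my current understanding (a guess, not verified): data sources are re-read on every plan/refresh, and the kubeconfig only exists inside the runner pod that executed the local-exec, so any later run in a fresh pod fails the read. A hedged alternative sketch that re-fetches the file from the instance on every run instead of relying on the pod filesystem (the remote path and the jq dependency are assumptions):

# hypothetical: read the kubeconfig from the remote host on each run,
# so it does not depend on a file surviving in the ephemeral runner pod
data "external" "kubeconfig" {
  program = ["sh", "-c", "ssh -i ${var.instance_ssh_folder}/${var.instance_ssh_file} ${var.instance_user}@${var.instance_ip} 'cat /etc/rancher/k3s/k3s.yaml' | jq -Rs '{content: .}'"]
}

output "kubeconfig_file" {
  description = "kubeconfig to access k3s cluster"
  value       = nonsensitive(data.external.kubeconfig.result.content)
}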
Thanks