sse-secure-systems/connaisseur

Duplicate MutatingWebhookConfiguration

graipher opened this issue · 3 comments

Describe the bug
There are currently two MutatingWebhookConfiguration resources in the chart (defined here), but they have the same name. While helm ignores this, k8s will (probably) apply only the second one, overwriting the first. But, crucially for me, when trying to inflate the helm chart using kustomize it (properly) fails saying that a resource with that ID already exists.

Expected behavior
The two webhook configurations should have unique names.

To reproduce

  1. Create a folder connaisseur and place these two files inside:

    1. a kustomization.yml file:

      namespace: connaisseur
      resources:
        - namespace.yml
      helmCharts:
        - name: connaisseur
          repo: https://sse-secure-systems.github.io/connaisseur/charts
          namespace: connaisseur
          version: 2.5.0
    2. a namespace.yml file:

      apiVersion: v1
      kind: Namespace
      metadata:
        name: connaisseur
  2. Run kustomize build connaisseur --enable-helm and get the following error:

    Error: could not parse rnode slice into resource map: may not add resource with an already registered id: 
    MutatingWebhookConfiguration.v1.admissionregistration.k8s.io/connaisseur-webhook.[noNs]
    

Versions:

  • OS: Ubuntu 24.04
  • Connaisseur: 3.5.0
  • Connaisseur Helm chart: 2.5.0
  • Helm: v3.15.0-rc.2
  • Kustomize: v5.4.2

Additional context
The two webhook configurations do differ in their annotations, but that is not enough to make them unique.
Currently they are defined as:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: {{ include "connaisseur.webhookName" . }}
  labels:
    {{- include "connaisseur.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": post-delete
    "helm.sh/hook-delete-policy": before-hook-creation, hook-succeeded, hook-failed
...
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: {{ include "connaisseur.webhookName" . }}
  labels:
    {{- include "connaisseur.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": post-install, post-upgrade, post-rollback
...

Hoy @graipher. The two webhooks with the same name are intended. The reason for that is a bit complicated:

Connaisseur is a admission controller that runs inside the k8s cluster and is hooked up into k8s with a so called mutatingwebhookconfiguration. When this configuration is applied, k8s expects Connaisseur to up and running and do its job. Should that not be the case, e.g. Connaisseur is not yet installed, then k8s will block all resources that Connaisseur would handle. This includes all workloads with image references. Essentially you brick your whole cluster with an active mutatingwebhookconfiguration and no running Connaisseur (you need to manually delte the mutatingwebhookconfiguration).

In order to avoid this situation, we set up the Connaisseur installation, so that the mutatingwebhookconfiguration is only applied after Connaisseur was successfully installed (if we do it during Connaisseur installation, we might run into race conditions, where Connaisseur needs to be applied faster then the mutatingwebhookconfiguration, because the mutatingwebhookconfiguration could block Connaisseur itself). Applying it after the Connaisseur installation is achieved with the helm hooks (e.g. post-install or post-upgrade). Now we run into the next problem: since the mutatingwebhookconfiguration isn't part of the regular helm installation step (this is because its defined as a helm hook), it won't be uninstalled when running helm uninstall, thus yet again creating the situation where we have a mutatingwebhookconfiguration without Connaisseur running -> clusters broken until the mutatingwebhookconfiguration is manually deleted.

How do we solve this problem? The helm hooks have so called delete policies, that automatically delete the given resource in configurable circumstances. We can't set those policies for the regular mutatingwebhookconfiguration, because otherwise the mutatingwebhookconfiguration would be delete after installing Connaisseur, redering the admission controller useless (Connaisseur needs to be hookup into k8s to function). Instead we overwrite the mutatingwebhookconfiguration with another one, but only on the post-delete helm hook (so after someone tries to run helm uninstall). This one has the delete policy set, so it deletes the resource afterwards.

With this setup we achieve our two main goals:

  • the mutatingwebhookconfiguration is applied only after Connaisseur is successfully installed
  • the mutatingwebhookconfiguration is removed once Connaisseur gets deleted with helm uninstall

We also aechive all of this without any users needing to manually apply/delete the mutatingwebhookconfiguration.

Now for your problem .. maybe there is still a solution. Does kustomize alle overwritting the values of the helm chart?

Hey @phbelitz, thanks for the explanation! When I saw that helm hooks are involved, I knew there might be a reason for it, but I was hoping it was just an oversight that the names are actually identical.

Yes, kustomize does allow the values of the helm chart to be overwritten by supplying a values.yml file. Here is the not stripped down version of my kustomization.yml:

namespace: connaisseur
resources:
  - namespace.yml
helmCharts:
  - name: connaisseur
    repo: https://sse-secure-systems.github.io/connaisseur/charts
    releaseName: connaisseur
    namespace: connaisseur
    valuesFile: values.yml
    version: 2.5.0
    includeCRDs: true

Currently I only configure the application.validators list to include my own public key for the private images:

---
application:
  # validator options: https://sse-secure-systems.github.io/connaisseur/latest/validators/
  validators:
    - name: allow
      type: static
      approve: true
    - name: deny
      type: static
      approve: false
    - name: dockerhub
      type: notaryv1
      trustRoots:
        - name: default # root from dockerhub
          key: |
            -----BEGIN PUBLIC KEY-----
            MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEOXYta5TgdCwXTCnLU09W5T4M4r9f
            QQrqJuADP6U7g5r9ICgPSmZuRHP/1AYUfOQW3baveKsT969EfELKj1lfCA==
            -----END PUBLIC KEY-----
        - name: sse # root from sse
          key: |
            -----BEGIN PUBLIC KEY-----
            MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEsx28WV7BsQfnHF1kZmpdCTTLJaWe
            d0CA+JOi8H4REuBaWSZ5zPDe468WuOJ6f71E7WFg3CVEVYHuoZt2UYbN/Q==
            -----END PUBLIC KEY-----
    - name: example
      type: cosign
      trustRoots:
        # the `default` key is used if no key is specified in image policy
        - name: default
          key: |
            -----BEGIN PUBLIC KEY-----
            ...
            -----END PUBLIC KEY-----
  policy:
    - pattern: "*:*"
      validator: deny
    - pattern: "docker.io/library/*:*"
      validator: dockerhub
    - pattern: "docker.io/securesystemsengineering/*:*"
      validator: dockerhub
      with:
        trustRoot: sse
    - pattern: "registry.k8s.io/*:*"
      validator: allow
    - pattern: "registry.example.com/*:*"
      validator: example
  features:
    namespacedValidation:
      mode: validate # 'ignore' or 'validate'
    detectionMode: true
  logLevel: debug