mariadb-operator/mariadb-operator

[Bug] rootPasswordSecretKeyRef race condition (provided Secret / Generated by operator)

Closed this issue ยท 9 comments

Documentation

Describe the bug

the spec.rootPasswordSecretKeyRef has two roles:

  • use to get the initial root password
  • use as Secret name which is generated by the operator, if missing

When the MariaDb CR is used in same Kustomization as a password generator (ESO External Secret Operator, which generates a Secret), the result is not predictable. Sometimes ESO generates the Secret before MariaDb operator, sometime ESO is faster.

Expected behaviour

Steps to reproduce the bug

  1. ...
  2. ...
  3. ...

Debug information

  • Related object events:
kubectl get events --field-selector involvedObject.name=<mariadb-resource-name>
kubectl get events --field-selector involvedObject.name=<backup-resource-name>
kubectl get events --field-selector involvedObject.name=<restore-resource-name>
  • mariadb-operator logs. Set the --log-level to debug if needed.

Environment details:

  • Kubernetes version: [Version number]
  • Kubernetes distribution: [Vanilla, EKS, GKE, AKS, Rancher, OpenShift, k3s, KIND...]
  • mariadb-operator version: [Version number]
  • Install method: [helm, OLM, or static manifests]
  • Install flavor: [minimal, recommended, or custom]

Additional context

Hey @poblin-orange ! Thanks for reporting.

@K-MeLeOn reported a very similar issue a while ago:

It is the same root cause: If the operator doesn't find the Secret related to the root password by the time it reconciles, it creates one. See the Secret reconciler logic for further detail

func (r *SecretReconciler) ReconcileRandomPassword(ctx context.Context, req *RandomPasswordRequest) (string, error) {

If the Secret is created asynchronously by an external operator (ESO or SealedSecrets), it can happen that it is not created by the time mariadb-operator kicks in.

We could introduce a new generate field in the Secret references to indicate that we don't want the mariadb-operator to generate a random Secret, instead, we will rely on an external controller:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password
    generate: false

All the Secret references fields are optional, if not provided they will default to:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password
    generate: true

In order to maintain backwards compatibility, if the user provides only the Secret reference:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password

the new generate field will also be defaulted to true:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password
    generate: true

@poblin-orange what do you think?
@K-MeLeOn your input here will be very much appreciated.

Just to add some context. Integrating with external-secrets-operator and sealed-secrets is crucial, as it enables potential GitOps uses cases:

  • Configure a MariaDB with a root password provided as a SealedSecret and reconciled into a Secret by sealed-secrets
  • Configure a MariaDB with a root password provided as a ExternalSecret which has a reference to a Vault key and it is reconciled into a Secret by ESO

@mmontes11 yes, a flag to configure would be perfect for our use cases, letting Gitops config be explicit about Secret ownership !

@poblin-orange If you're using ArgoCD you can configure the deployment like that :

---
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: [...]
  namespace: [...]
  annotations:
    # Before MariaDb to avoid password generation
    argocd.argoproj.io/sync-wave: "-10"
spec:
  [...]

Hey @poblin-orange ! Thanks for reporting.

@K-MeLeOn reported a very similar issue a while ago:

* [[Bug]  SealedSecret error with the MariaDb random password generation #340](https://github.com/mariadb-operator/mariadb-operator/issues/340)

It is the same root cause: If the operator doesn't find the Secret related to the root password by the time it reconciles, it creates one. See the Secret reconciler logic for further detail

func (r *SecretReconciler) ReconcileRandomPassword(ctx context.Context, req *RandomPasswordRequest) (string, error) {

If the Secret is created asynchronously by an external operator (ESO or SealedSecrets), it can happen that it is not created by the time mariadb-operator kicks in.

We could introduce a new generate field in the Secret references to indicate that we don't want the mariadb-operator to generate a random Secret, instead, we will rely on an external controller:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password
    generate: false

All the Secret references fields are optional, if not provided they will default to:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password
    generate: true

In order to maintain backwards compatibility, if the user provides only the Secret reference:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password

the new generate field will also be defaulted to true:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: mariadb
    key: root-password
    generate: true

@poblin-orange what do you think? @K-MeLeOn your input here will be very much appreciated.

This seems correct to me, but I don't know if it's the operator's role to manage this or the gitops' role.

In the case of the gitops, its role is to coordinate deployment to avoid this type of problem.
In the case of the operator, its role is to coordinate the configuration of what is being managed.

In both cases, I think it's interesting to implement this on the operator for people like me and @poblin-orange now, who have knowledge gaps on their gitops application. On the other hand, it's interesting for the operator to offer the possibility of managing password generation or not ๐Ÿ‘

I agree the operator CR must be as explicit as possible, avoiding any "side effect" tricks, which induces subtle errors (gitops behind an example catching the error, could be hidden in non gitops contexts).

@K-MeLeOn not using Argo, but FluxCD. To workaround, need to define a separate Kustomization with dependsOn.

Great, it seems like all of us are in agreement. I will try to squeeze this into v0.0.28.

One more thought: Generating passwords should probably be opt-in by providing generate = true. This basically means that the GitOps use cases we mentioned should work by default with the current API. This decision will be reflected in documentation and in the release notes of course!

Supported via #598

Stay tuned for v0.0.0.28!