redhat-cop/namespace-configuration-operator

v1.2.5 fails to start - lib64/libc.so.6: version `GLIBC_2.34' not found

1337andre opened this issue · 3 comments

Hi folks,

with image quay.io/redhat-cop/namespace-configuration-operator:v1.2.5, the namespace-configuration-operator fails to start:

lib64/libc.so.6: version `GLIBC_2.34' not found
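This usually means the /manager binary was compiled against a newer glibc than the one shipped in the runtime base image. A quick way to confirm the mismatch, as a sketch assuming podman and binutils are available on a host (docker works the same way):

# Create a container from the image and extract the /manager binary:
ctr=$(podman create quay.io/redhat-cop/namespace-configuration-operator:v1.2.5 /manager)
podman cp "$ctr":/manager ./manager
podman rm "$ctr"
# List the glibc symbol versions the binary requires:
objdump -T ./manager | grep -o 'GLIBC_[0-9.]*' | sort -uV
# If GLIBC_2.34 appears here but the base image's libc is older, the dynamic
# loader aborts on startup with exactly the error quoted above.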
# kubectl describe pod namespace-configuration-operator-67dc8fdbbb-z7xwd
Name:                 namespace-configuration-operator-67dc8fdbbb-z7xwd
Namespace:            default
Priority:             1
Priority Class Name:  low
Service Account:      controller-manager
Node:                 rxcmpk8s19.hcp-infra.blub.de/172.31.102.101
Start Time:           Tue, 14 Nov 2023 08:26:13 +0100
Labels:               app.kubernetes.io/instance=namespace-configuration-operator
                      app.kubernetes.io/name=namespace-configuration-operator
                      control-plane=namespace-configuration-operator
                      pod-template-hash=67dc8fdbbb
Annotations:          cni.projectcalico.org/containerID: 24d87fc3831cd061e868876406b8b5cb1c393f09a49b7a21ec17f6012596ab9c
                      cni.projectcalico.org/podIP: 10.4.192.202/32
                      cni.projectcalico.org/podIPs: 10.4.192.202/32
                      kubectl.kubernetes.io/restartedAt: 2023-09-01T11:43:34+02:00
Status:               Running
IP:                   10.4.192.202
IPs:
  IP:           10.4.192.202
Controlled By:  ReplicaSet/namespace-configuration-operator-67dc8fdbbb
Containers:
  kube-rbac-proxy:
    Container ID:  containerd://7c5362a8759c282136bc91d150d392d3c07dad903a7645a3d2feafcf80c76cb0
    Image:         quay.io/redhat-cop/kube-rbac-proxy:v0.11.0
    Image ID:      quay.io/redhat-cop/kube-rbac-proxy@sha256:c68135620167c41e3d9f6c1d2ca1eb8fa24312b86186d09b8010656b9d25fb47
    Port:          8443/TCP
    Host Port:     0/TCP
    Args:
      --secure-listen-address=0.0.0.0:8443
      --upstream=http://127.0.0.1:8080/
      --logtostderr=true
      --tls-cert-file=/etc/certs/tls/tls.crt
      --tls-private-key-file=/etc/certs/tls/tls.key
      --v=10
    State:          Running
      Started:      Tue, 14 Nov 2023 08:26:15 +0100
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /etc/certs/tls from namespace-configuration-operator-certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ln4t4 (ro)
  namespace-configuration-operator:
    Container ID:  containerd://7773f9892338e07a8f25b4fd2b2c1907598555e5fa442085716ab02f3e12a18c
    Image:         quay.io/redhat-cop/namespace-configuration-operator:v1.2.5
    Image ID:      quay.io/redhat-cop/namespace-configuration-operator@sha256:20debaa7b91aebf034a0bd2baa80d37bac2b23be5c76efc2a6edbc14f942b2b1
    Port:          <none>
    Host Port:     <none>
    Command:
      /manager
    Args:
      --leader-elect
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 14 Nov 2023 08:37:17 +0100
      Finished:     Tue, 14 Nov 2023 08:37:17 +0100
    Ready:          False
    Restart Count:  7
    Requests:
      cpu:        100m
      memory:     20Mi
    Liveness:     http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:    http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /tmp/k8s-webhook-server/serving-certs from webhook-server-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ln4t4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  namespace-configuration-operator-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  namespace-configuration-operator-certs
    Optional:    false
  webhook-server-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  webhook-server-cert
    Optional:    false
  kube-api-access-ln4t4:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  14m                   default-scheduler  Successfully assigned default/namespace-configuration-operator-67dc8fdbbb-z7xwd to rxcmpk8s19.hcp-infra.blub.de
  Normal   Pulled     14m                   kubelet            Container image "quay.io/redhat-cop/kube-rbac-proxy:v0.11.0" already present on machine
  Normal   Created    14m                   kubelet            Created container kube-rbac-proxy
  Normal   Started    14m                   kubelet            Started container kube-rbac-proxy
  Normal   Pulling    14m                   kubelet            Pulling image "quay.io/redhat-cop/namespace-configuration-operator:v1.2.5"
  Normal   Pulled     14m                   kubelet            Successfully pulled image "quay.io/redhat-cop/namespace-configuration-operator:v1.2.5" in 6.834695195s (6.834708313s including waiting)
  Normal   Created    13m (x4 over 14m)     kubelet            Created container namespace-configuration-operator
  Normal   Started    13m (x4 over 14m)     kubelet            Started container namespace-configuration-operator
  Normal   Pulled     13m (x3 over 14m)     kubelet            Container image "quay.io/redhat-cop/namespace-configuration-operator:v1.2.5" already present on machine
  Warning  BackOff    4m27s (x52 over 14m)  kubelet            Back-off restarting failed container namespace-configuration-operator in pod namespace-configuration-operator-67dc8fdbbb-z7xwd_default(4d535843-f87d-4e36-86d8-669881710c75)
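Note that the describe output only shows CrashLoopBackOff; the loader error itself comes from the crashed container's previous logs:

# Fetch the log of the last failed run of the manager container:
kubectl logs namespace-configuration-operator-67dc8fdbbb-z7xwd \
  -c namespace-configuration-operator --previous
# prints the GLIBC error quoted at the top of this issue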

The failing version 1.2.5 has just been published to OperatorHub community-operators:
redhat-openshift-ecosystem/community-operators-prod#3593

So the error is starting to hit everyone using the operator through OperatorHub:

2023-11-22T09:54:58Z INFO Starting workers {"controller": "namespaceconfig", "controllerGroup": "redhatcop.redhat.io", "controllerKind": "NamespaceConfig", "worker count": 1}
2023-11-22T09:54:58Z INFO controllers.NamespaceConfig reconciling started {"namespaceconfig": {"name":"basic-user-namespace-monitoring"}}
2023-11-22T09:54:59Z INFO Observed a panic in reconciler: interface conversion: validation.Schema is *validation.schemaValidation, not *validation.NullSchema {"controller": "namespaceconfig", "controllerGroup": "redhatcop.redhat.io", "controllerKind": "NamespaceConfig", "NamespaceConfig": {"name":"basic-user-namespace-monitoring"}, "namespace": "", "name": "basic-user-namespace-monitoring", "reconcileID": "d1c4208b-33eb-444a-941c-d2af11af0852"}
panic: interface conversion: validation.Schema is *validation.schemaValidation, not *validation.NullSchema [recovered]
panic: interface conversion: validation.Schema is *validation.schemaValidation, not *validation.NullSchema
goroutine 209 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.2/pkg/internal/controller/controller.go:115 +0x1e5
panic({0x22be620?, 0xc0041aa660?})
/opt/hostedtoolcache/go/1.21.4/x64/src/runtime/panic.go:914 +0x21f
github.com/redhat-cop/operator-utils/pkg/util/lockedresourcecontroller.(*LockedResourceManager).validateLockedResources(0xc000854380, {0xc000990800, 0x34, 0xc000637a60?})
/home/runner/go/pkg/mod/github.com/redhat-cop/operator-utils@v1.3.7/pkg/util/lockedresourcecontroller/locked-resource-manager.go:341 +0x6f3
github.com/redhat-cop/operator-utils/pkg/util/lockedresourcecontroller.(*LockedResourceManager).SetResources(0xc000854380, {0xc000990800?, 0x34, 0x40})
/home/runner/go/pkg/mod/github.com/redhat-cop/operator-utils@v1.3.7/pkg/util/lockedresourcecontroller/locked-resource-manager.go:82 +0x77
github.com/redhat-cop/operator-utils/pkg/util/lockedresourcecontroller.(*LockedResourceManager).Restart(0xc000854380, {0x27c5e68, 0xc00075e660}, {0xc000990800, 0x34, 0x40}, {0x3677360, 0x0, 0x0}, 0x0, ...)
/home/runner/go/pkg/mod/github.com/redhat-cop/operator-utils@v1.3.7/pkg/util/lockedresourcecontroller/locked-resource-manager.go:222 +0x16c
github.com/redhat-cop/operator-utils/pkg/util/lockedresourcecontroller.(*EnforcingReconciler).UpdateLockedResourcesWithRestConfig(0xc0000f0f20, {0x27c5e68, 0xc00075e660}, {0x27db750?, 0xc00061a1a0?}, {0xc000990800, 0x34, 0x40}, {0x3677360, 0x0, ...}, ...)
/home/runner/go/pkg/mod/github.com/redhat-cop/operator-utils@v1.3.7/pkg/util/lockedresourcecontroller/enforcing-reconciler.go:117 +0x3bc
github.com/redhat-cop/operator-utils/pkg/util/lockedresourcecontroller.(*EnforcingReconciler).UpdateLockedResources(...)
/home/runner/go/pkg/mod/github.com/redhat-cop/operator-utils@v1.3.7/pkg/util/lockedresourcecontroller/enforcing-reconciler.go:91
github.com/redhat-cop/namespace-configuration-operator/controllers.(*NamespaceConfigReconciler).Reconcile(0xc0000f0f20, {0x27c5e68, 0xc00075e660}, {{{0x0, 0x0}, {0xc00025c880, 0x1f}}})
/home/runner/work/namespace-configuration-operator/namespace-configuration-operator/controllers/namespaceconfig_controller.go:121 +0x83c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x27c5e68?, {0x27c5e68?, 0xc00075e660?}, {{{0x0?, 0x21c61c0?}, {0xc00025c880?, 0x27b66d0?}}})
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.2/pkg/internal/controller/controller.go:118 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00027d900, {0x27c5ea0, 0xc0003f2690}, {0x233bc20?, 0xc0000626c0?})
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.2/pkg/internal/controller/controller.go:314 +0x365
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00027d900, {0x27c5ea0, 0xc0003f2690})
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.2/pkg/internal/controller/controller.go:265 +0x1c9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.2/pkg/internal/controller/controller.go:226 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 70
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.2/pkg/internal/controller/controller.go:222 +0x565

With this issue we were receiving constant Prometheus TargetDown alerts across all clusters: the new operator version 1.2.5 cannot start because of the panic above, and OpenShift ships a generic TargetDown alert that checks that the operator's metrics endpoint is alive, which in this state it is not.
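For anyone triaging the alert, the scrape target behind TargetDown can be probed directly. A sketch, assuming the controller-manager service account token is authorized by the proxy's RBAC config (deployment name, service account, and port are taken from the pod description above):

# Forward the kube-rbac-proxy port and probe the metrics endpoint:
kubectl port-forward deploy/namespace-configuration-operator 8443:8443 &
sleep 2
curl -k -H "Authorization: Bearer $(kubectl create token controller-manager)" \
  https://localhost:8443/metrics
# While the manager container is crashlooping, the proxy's upstream on
# 127.0.0.1:8080 is down, so no metrics are served and the target stays down.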

Until a fix lands in a newer version, I have mitigated the issue for now:

  • Disabled ArgoCD autosync
  • Uninstalled the failed operator version 1.2.5 (see the sketch after the Subscription below)
  • Pinned the old operator version 1.2.4 with Manual install plan approval. That way, once 1.2.4 is manually approved and installed, it won't automatically jump to the failing 1.2.5, because every upgrade requires manual approval:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
  name: namespace-configuration-operator
spec:
  channel: alpha
  installPlanApproval: Manual
  startingCSV: namespace-configuration-operator.v1.2.4
  name: namespace-configuration-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
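For reference, a sketch of the kubectl side of steps 2 and 3 (the namespace and InstallPlan name are placeholders; the CSV name follows the startingCSV pattern above):

# Remove the failed 1.2.5 install (Subscription first, then its CSV):
kubectl delete subscription namespace-configuration-operator -n <operator-namespace>
kubectl delete csv namespace-configuration-operator.v1.2.5 -n <operator-namespace>
# After re-creating the Subscription above, approve the pending InstallPlan for 1.2.4:
kubectl get installplan -n <operator-namespace>
kubectl patch installplan <installplan-name> -n <operator-namespace> \
  --type merge -p '{"spec":{"approved":true}}'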

This seems to be fixed with v1.2.6:
#169

/cc @slopezz @raffaelespazzoli