hashicorp/consul

Consul image cannot be started on Kubernetes/OpenShift without a mounted volume

fedinskiy opened this issue · 0 comments

Overview of the Issue

When the official Consul Docker image is started on Kubernetes without a mounted volume, it fails with either a su-exec: setgroups(1000): Operation not permitted error or a failed to write NodeID to disk error.

Reproduction Steps

These steps are for OpenShift; the steps for Kubernetes should be similar:

  1. Log in to OpenShift
  2. Create a new project: oc new-project ts-consul
  3. Create a file consul.yml with the following content:
---
apiVersion: "v1"
kind: "List"
items:
- apiVersion: "v1"
  kind: "Service"
  metadata:
    name: "consul"
  spec:
    ports:
    - name: "http"
      port: 8500
      targetPort: 8500
    selector:
      deployment: "consul"
    type: "ClusterIP"
- apiVersion: "apps/v1"
  kind: "Deployment"
  metadata:
    name: "consul"
  spec:
    replicas: 1
    selector:
      matchLabels:
        deployment: "consul"
    template:
      metadata:
        labels:
          deployment: "consul"
      spec:
        containers:
        - image: "docker.io/hashicorp/consul:1.19"
#          env:
#          - name: "CONSUL_DISABLE_PERM_MGMT"
#            value: "yes"
          imagePullPolicy: "IfNotPresent"
          name: "consul"
          ports:
          - containerPort: 8500
            name: "http"
            protocol: "TCP"

  4. Deploy the container: oc apply -f consul.yml -n ts-consul
  5. Start the container: oc scale deployment/consul --replicas=1 -n ts-consul
  6. Wait for several seconds and check the status:
$ oc get pods
NAME                      READY   STATUS             RESTARTS      AGE
consul-6b486f7bfc-kjcd4   0/1     CrashLoopBackOff   3 (15s ago)   56s
  7. Check the pod logs: oc logs pod/consul-6b486f7bfc-kjcd4 (replace with the name of your pod):
su-exec: setgroups(1000): Operation not permitted

Alternative solution

We can follow the solution implemented in hashicorp/docker-consul#103 and set the CONSUL_DISABLE_PERM_MGMT environment variable. Unfortunately, this just leads to a different error:

 failed to setup node ID: failed to write NodeID to disk: open /consul/data/node-id: permission denied
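
For reference, setting the variable corresponds to uncommenting the env lines in consul.yml from the reproduction steps. A fragment of the container entry with those lines enabled (nothing else is changed from the reproduction file):

```yaml
containers:
- image: "docker.io/hashicorp/consul:1.19"
  env:
  # Same name and value as the commented-out lines in consul.yml
  - name: "CONSUL_DISABLE_PERM_MGMT"
    value: "yes"
  imagePullPolicy: "IfNotPresent"
  name: "consul"
  ports:
  - containerPort: 8500
    name: "http"
    protocol: "TCP"
```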

Consul info for both Client and Server

We use the official Docker image docker.io/hashicorp/consul:1.19

Operating system and Environment details

OS: 6.10.8-200.fc40.x86_64
OpenShift version:

Client Version: 4.16.10
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Kubernetes Version: v1.29.7+4510e9c

Similar errors have been described several times before:

  1. #4172: the suggested solution is to use a custom Docker image.
  2. hashicorp/docker-consul#103: added the CONSUL_DISABLE_PERM_MGMT environment variable, which does not help in this case (see the "Alternative solution" section).
  3. #10403: the recommended solution is to check "mount parameters", but that requires mounting a volume, which would be overkill in some cases (e.g. training or integration testing).
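
For comparison, a volume-based setup of the kind #10403 points toward might look roughly like the fragment below. This is an untested sketch and is not taken from that issue: the volume name consul-data and the choice of an emptyDir volume are assumptions made for illustration, and only the mount path /consul/data comes from the error message above.

```yaml
# Fragment of the Deployment's pod template spec (hypothetical, untested)
spec:
  containers:
  - image: "docker.io/hashicorp/consul:1.19"
    name: "consul"
    env:
    - name: "CONSUL_DISABLE_PERM_MGMT"
      value: "yes"
    volumeMounts:
    # Path taken from the "failed to write NodeID to disk" error
    - name: "consul-data"          # assumed name
      mountPath: "/consul/data"
  volumes:
  # emptyDir is an assumption; its contents are lost when the pod is removed
  - name: "consul-data"
    emptyDir: {}
```

Even if this works, it is extra configuration that a throwaway deployment should not need.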

Using the bitnami/consul image can be considered a workaround, but it comes with its own challenges [1], so it would be preferable to have this issue solved for the official image.

This was reported earlier in this repo (#12882) and in the docker-consul repo (hashicorp/docker-consul#184).

[1] bitnami-labs/sealed-secrets#822