emqx/emqx-operator

AWS EKS Fargate Deployment error mkdir: cannot create directory ‘/opt/emqx/data/configs’: Permission denied

weidadedawei opened this issue · 2 comments

Environment

An AWS EKS cluster that runs Pods on Fargate, with an EFS volume mounted.

  • EMQX version: 5.5.0

Steps to reproduce

  1. Apply the following emqx.yaml file:
apiVersion: apps.emqx.io/v2beta1
kind: EMQX
metadata:
  namespace: emqx
  name: emqx
spec:
  image: emqx:5.5.0
  coreTemplate:
    spec:
      ## If persistence is enabled, you need to configure podSecurityContext.
      ## For details, see discussion: https://github.com/emqx/emqx-operator/discussions/716
      podSecurityContext:
        runAsUser: 1000
        runAsGroup: 1000
        runAsNonRoot: true
        fsGroup: 1000
        fsGroupChangePolicy: Always
        supplementalGroups:
          - 1000
      containerSecurityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
      ## The EMQX custom resource does not support updating this field at runtime
      volumeClaimTemplates:
        ## More info: https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/storage-classes.html
        ## Manage the Amazon EBS CSI driver as an Amazon EKS add-on.
        ## For more documentation, see: https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/managing-ebs-csi.html
        storageClassName: efs-sc
        resources:
         requests:
           storage: 10Gi
        accessModes:
         - ReadWriteOnce
  dashboardServiceTemplate:
    metadata:
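      ## More info: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/annotations/
      annotations:
        ## Specifies whether the NLB is Internet-facing or internal. If not specified, it defaults to internal.
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
        ## Specifies the Availability Zones the NLB routes traffic to. Specify at least one subnet; either a subnetID or a subnetName (subnet Name tag) can be used.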
        service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-0381ae34992d0ab0a,subnet-0d9c7b3a6b0049810
    spec:
      type: LoadBalancer
      ## More info: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/nlb/
      loadBalancerClass: service.k8s.aws/nlb
  listenersServiceTemplate:
    metadata:
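      ## More info: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/annotations/
      annotations:
        ## Specifies whether the NLB is Internet-facing or internal. If not specified, it defaults to internal.
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
        ## Specifies the Availability Zones the NLB routes traffic to. Specify at least one subnet; either a subnetID or a subnetName (subnet Name tag) can be used.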
        service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-0381ae34992d0ab0a,subnet-0d9c7b3a6b0049810
    spec:
      type: LoadBalancer
      ## More info: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/nlb/
      loadBalancerClass: service.k8s.aws/nlb

  2. The PVC is bound in the console:
image

  3. Startup fails:

image

Error in the log:
mkdir: cannot create directory ‘/opt/emqx/data/configs’: Permission denied

Pod YAML:

---
metadata:
  annotations:
    kubectl.kubernetes.io/restartedAt: '2024-02-19T16:17:38+08:00'
  generateName: emqx-core-5d8fc69f48-
  labels:
    apps.emqx.io/db-role: core
    apps.emqx.io/instance: emqx
    apps.emqx.io/managed-by: emqx-operator
    apps.emqx.io/pod-template-hash: 5d8fc69f48
    apps.kubernetes.io/pod-index: '0'
    controller-revision-hash: emqx-core-5d8fc69f48-6b9c7fd856
    eks.amazonaws.com/fargate-profile: EMQX
    statefulset.kubernetes.io/pod-name: emqx-core-5d8fc69f48-0
  name: emqx-core-5d8fc69f48-0
  namespace: emqx
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: StatefulSet
      name: emqx-core-5d8fc69f48
      uid: b2bb7fea-da05-4857-b1f8-a91c58f966b2
  resourceVersion: '57538361'
spec:
  containers:
    - env:
        - name: EMQX_DASHBOARD__LISTENERS__HTTP__BIND
          value: '18083'
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: EMQX_CLUSTER__DISCOVERY_STRATEGY
          value: dns
        - name: EMQX_CLUSTER__DNS__RECORD_TYPE
          value: srv
        - name: EMQX_CLUSTER__DNS__NAME
          value: emqx-headless.emqx.svc.cluster.local
        - name: EMQX_HOST
          value: $(POD_NAME).$(EMQX_CLUSTER__DNS__NAME)
        - name: EMQX_NODE__DATA_DIR
          value: data
        - name: EMQX_NODE__ROLE
          value: core
        - name: EMQX_NODE__COOKIE
          valueFrom:
            secretKeyRef:
              key: node_cookie
              name: emqx-node-cookie
        - name: EMQX_API_KEY__BOOTSTRAP_FILE
          value: '"/opt/emqx/data/bootstrap_api_key"'
      image: 'emqx:5.5.0'
      imagePullPolicy: IfNotPresent
      livenessProbe:
        failureThreshold: 3
        httpGet:
          path: /status
          port: dashboard
          scheme: HTTP
        initialDelaySeconds: 60
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 1
      name: emqx
      ports:
        - containerPort: 18083
          name: dashboard
          protocol: TCP
      readinessProbe:
        failureThreshold: 12
        httpGet:
          path: /status
          port: dashboard
          scheme: HTTP
        initialDelaySeconds: 10
        periodSeconds: 5
        successThreshold: 1
        timeoutSeconds: 1
      resources: {}
      securityContext:
        runAsGroup: 1000
        runAsNonRoot: true
        runAsUser: 1000
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /opt/emqx/data/bootstrap_api_key
          name: bootstrap-api-key
          readOnly: true
          subPath: bootstrap_api_key
        - mountPath: /opt/emqx/etc/emqx.conf
          name: bootstrap-config
          readOnly: true
          subPath: emqx.conf
        - mountPath: /opt/emqx/log
          name: emqx-core-log
        - mountPath: /opt/emqx/data
          name: emqx-core-data
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-7d9xr
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostname: emqx-core-5d8fc69f48-0
  nodeName: fargate-ip-192-168-249-23.us-west-1.compute.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 2000001000
  priorityClassName: system-node-critical
  readinessGates:
    - conditionType: apps.emqx.io/on-serving
  restartPolicy: Always
  schedulerName: fargate-scheduler
  securityContext:
    fsGroup: 1000
    fsGroupChangePolicy: Always
    runAsGroup: 1000
    runAsNonRoot: true
    runAsUser: 1000
    supplementalGroups:
      - 1000
  serviceAccount: default
  serviceAccountName: default
  subdomain: emqx-headless
  terminationGracePeriodSeconds: 30
  tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  volumes:
    - name: emqx-core-data
      persistentVolumeClaim:
        claimName: emqx-core-data-emqx-core-5d8fc69f48-0
    - name: bootstrap-api-key
      secret:
        defaultMode: 420
        secretName: emqx-bootstrap-api-key
    - configMap:
        defaultMode: 420
        name: emqx-configs
      name: bootstrap-config
    - emptyDir: {}
      name: emqx-core-log
    - name: kube-api-access-7d9xr
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
                - key: ca.crt
                  path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace


Expected behavior

EMQX should start normally.

Actual behavior

On the EFS file system, /opt/emqx/data/configs was not created, but the bootstrap_api_key file exists.
image
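
One way to narrow this down is to mount the same PVC in a throwaway Pod that runs with the same UID/GID and reports who owns the mount point. The manifest below is only a sketch: the Pod name, the `busybox:1.36` image, the `/data` mount path and the `.write-test` file name are assumptions chosen for illustration, and it assumes the Fargate profile also matches this Pod and that the claim can be mounted a second time.

```yaml
# Hypothetical debug Pod: prints the effective identity and the numeric
# owner of the mounted volume, then attempts a write as UID/GID 1000.
apiVersion: v1
kind: Pod
metadata:
  name: emqx-data-debug              # assumed name, not part of the report
  namespace: emqx
spec:
  restartPolicy: Never
  securityContext:                   # same identity as the EMQX core Pod
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    fsGroup: 1000
  containers:
    - name: debug
      image: busybox:1.36            # assumed image; any small shell image works
      command:
        - sh
        - -c
        - |
          id
          ls -ldn /data
          touch /data/.write-test && echo "writable" && rm /data/.write-test
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: emqx-core-data-emqx-core-5d8fc69f48-0   # claim from the Pod YAML above
```

If `ls -ldn /data` shows an owner other than 1000 and the `touch` fails, the `fsGroup` setting is not being applied to the EFS mount, which would match the `Permission denied` error.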

Could you please retry with AWS EBS? I suspect AWS EFS does not support securityContext.
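
If EFS has to stay, one option that may be worth trying is to set the directory owner on the EFS side instead of relying on `fsGroup`. The sketch below uses the EFS CSI driver's access-point provisioning mode with `uid`/`gid` set to 1000, to match the `runAsUser`/`runAsGroup` in the report. The `fileSystemId` and `basePath` values are placeholders, and whether dynamic provisioning is available for Fargate Pods depends on the driver and platform version, so treat this as an assumption to verify rather than a confirmed fix.

```yaml
# Sketch of an EFS StorageClass that creates one access point per volume,
# with the access point's root directory owned by 1000:1000.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap               # provision volumes as EFS access points
  fileSystemId: fs-0123456789abcdef0     # placeholder, replace with the real file system ID
  directoryPerms: "700"                  # permissions of the access point root directory
  uid: "1000"                            # matches runAsUser in the EMQX spec
  gid: "1000"                            # matches runAsGroup in the EMQX spec
  basePath: /emqx                        # hypothetical parent directory on the file system
```

With static provisioning, the same effect can be had by creating the access point manually with POSIX user 1000:1000 and a root directory owned by 1000:1000, then referencing that access point from the PersistentVolume.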

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.