AWS EKS Fargate deployment fails with mkdir: cannot create directory ‘/opt/emqx/data/configs’: Permission denied

Environment

An AWS EKS cluster running Pods on Fargate, with an EFS mount configured

  • EMQX version:
    5.5.0
  • Operating system version:
    Ubuntu 22.04

Steps to reproduce

  1. Apply the emqx.yaml file
apiVersion: apps.emqx.io/v2beta1
kind: EMQX
metadata:
  namespace: emqx
  name: emqx
spec:
  image: emqx:5.5.0
  coreTemplate:
    spec:
      ## If persistence is enabled, you need to configure podSecurityContext.
      ## For details, see the discussion: https://github.com/emqx/emqx-operator/discussions/716
      podSecurityContext:
        runAsUser: 1000
        runAsGroup: 1000
        runAsNonRoot: true
        fsGroup: 1000
        fsGroupChangePolicy: Always
        supplementalGroups:
          - 1000
      containerSecurityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
      ## The EMQX custom resource does not support updating this field at runtime
      volumeClaimTemplates:
        ## More information: https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/storage-classes.html
        ## Manage the Amazon EBS CSI driver as an Amazon EKS add-on.
        ## For more documentation, see: https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/managing-ebs-csi.html
        storageClassName: efs-sc
        resources:
         requests:
           storage: 10Gi
        accessModes:
         - ReadWriteOnce
  dashboardServiceTemplate:
    metadata:
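      ## More information: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/annotations/
      annotations:
        ## Specifies whether the NLB is Internet-facing or internal. If not specified, it defaults to internal.
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
        ## Specifies the Availability Zones the NLB routes traffic to. Specify at least one subnet; either a subnetID or a subnetName (the subnet's Name tag) can be used.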
        service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-0381ae34992d0ab0a,subnet-0d9c7b3a6b0049810
    spec:
      type: LoadBalancer
      ## More information: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/nlb/
      loadBalancerClass: service.k8s.aws/nlb
  listenersServiceTemplate:
    metadata:
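      ## More information: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/annotations/
      annotations:
        ## Specifies whether the NLB is Internet-facing or internal. If not specified, it defaults to internal.
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
        ## Specifies the Availability Zones the NLB routes traffic to. Specify at least one subnet; either a subnetID or a subnetName (the subnet's Name tag) can be used.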
        service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-0381ae34992d0ab0a,subnet-0d9c7b3a6b0049810
    spec:
      type: LoadBalancer
      ## More information: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/nlb/
      loadBalancerClass: service.k8s.aws/nlb

  2. The PVC is bound in the background

  3. Startup fails

Error in the log:
mkdir: cannot create directory ‘/opt/emqx/data/configs’: Permission denied

YAML of the Pod

---
metadata:
  annotations:
    kubectl.kubernetes.io/restartedAt: '2024-02-19T16:17:38+08:00'
  generateName: emqx-core-5d8fc69f48-
  labels:
    apps.emqx.io/db-role: core
    apps.emqx.io/instance: emqx
    apps.emqx.io/managed-by: emqx-operator
    apps.emqx.io/pod-template-hash: 5d8fc69f48
    apps.kubernetes.io/pod-index: '0'
    controller-revision-hash: emqx-core-5d8fc69f48-6b9c7fd856
    eks.amazonaws.com/fargate-profile: EMQX
    statefulset.kubernetes.io/pod-name: emqx-core-5d8fc69f48-0
  name: emqx-core-5d8fc69f48-0
  namespace: emqx
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: StatefulSet
      name: emqx-core-5d8fc69f48
      uid: b2bb7fea-da05-4857-b1f8-a91c58f966b2
  resourceVersion: '57538361'
spec:
  containers:
    - env:
        - name: EMQX_DASHBOARD__LISTENERS__HTTP__BIND
          value: '18083'
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: EMQX_CLUSTER__DISCOVERY_STRATEGY
          value: dns
        - name: EMQX_CLUSTER__DNS__RECORD_TYPE
          value: srv
        - name: EMQX_CLUSTER__DNS__NAME
          value: emqx-headless.emqx.svc.cluster.local
        - name: EMQX_HOST
          value: $(POD_NAME).$(EMQX_CLUSTER__DNS__NAME)
        - name: EMQX_NODE__DATA_DIR
          value: data
        - name: EMQX_NODE__ROLE
          value: core
        - name: EMQX_NODE__COOKIE
          valueFrom:
            secretKeyRef:
              key: node_cookie
              name: emqx-node-cookie
        - name: EMQX_API_KEY__BOOTSTRAP_FILE
          value: '"/opt/emqx/data/bootstrap_api_key"'
      image: 'emqx:5.5.0'
      imagePullPolicy: IfNotPresent
      livenessProbe:
        failureThreshold: 3
        httpGet:
          path: /status
          port: dashboard
          scheme: HTTP
        initialDelaySeconds: 60
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 1
      name: emqx
      ports:
        - containerPort: 18083
          name: dashboard
          protocol: TCP
      readinessProbe:
        failureThreshold: 12
        httpGet:
          path: /status
          port: dashboard
          scheme: HTTP
        initialDelaySeconds: 10
        periodSeconds: 5
        successThreshold: 1
        timeoutSeconds: 1
      resources: {}
      securityContext:
        runAsGroup: 1000
        runAsNonRoot: true
        runAsUser: 1000
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /opt/emqx/data/bootstrap_api_key
          name: bootstrap-api-key
          readOnly: true
          subPath: bootstrap_api_key
        - mountPath: /opt/emqx/etc/emqx.conf
          name: bootstrap-config
          readOnly: true
          subPath: emqx.conf
        - mountPath: /opt/emqx/log
          name: emqx-core-log
        - mountPath: /opt/emqx/data
          name: emqx-core-data
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-7d9xr
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostname: emqx-core-5d8fc69f48-0
  nodeName: fargate-ip-192-168-249-23.us-west-1.compute.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 2000001000
  priorityClassName: system-node-critical
  readinessGates:
    - conditionType: apps.emqx.io/on-serving
  restartPolicy: Always
  schedulerName: fargate-scheduler
  securityContext:
    fsGroup: 1000
    fsGroupChangePolicy: Always
    runAsGroup: 1000
    runAsNonRoot: true
    runAsUser: 1000
    supplementalGroups:
      - 1000
  serviceAccount: default
  serviceAccountName: default
  subdomain: emqx-headless
  terminationGracePeriodSeconds: 30
  tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  volumes:
    - name: emqx-core-data
      persistentVolumeClaim:
        claimName: emqx-core-data-emqx-core-5d8fc69f48-0
    - name: bootstrap-api-key
      secret:
        defaultMode: 420
        secretName: emqx-bootstrap-api-key
    - configMap:
        defaultMode: 420
        name: emqx-configs
      name: bootstrap-config
    - emptyDir: {}
      name: emqx-core-log
    - name: kube-api-access-7d9xr
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
                - key: ca.crt
                  path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace


Expected behavior

EMQX should start normally.

Actual behavior

On the EFS file system, /opt/emqx/data/configs was not created, but the bootstrap_api_key file exists.
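
One way to narrow this down is to look at the mounted directory from a container running as the same user. The manifest below is only a minimal sketch and was not part of the original report. The Pod name `efs-perm-check`, the `busybox:1.36` image, and the `/data` mount path are assumptions chosen for illustration; the claim name and the UID/GID 1000 are copied from the Pod YAML above. It also assumes that the Fargate profile selects Pods in the `emqx` namespace without extra labels and that the claim can be mounted by a second Pod.

```yaml
# Hypothetical throwaway Pod for inspecting ownership of the EFS-backed volume.
apiVersion: v1
kind: Pod
metadata:
  name: efs-perm-check            # assumed name, not from the report
  namespace: emqx
spec:
  restartPolicy: Never
  securityContext:                # same identity as the failing EMQX Pod
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    fsGroup: 1000
  containers:
    - name: check
      image: busybox:1.36         # assumed image; any image with a shell works
      command:
        - sh
        - -c
        - |
          # Print the effective UID/GID, then the numeric owner and mode of the mount.
          id
          ls -ldn /data
          # Try to create and remove a directory, as the EMQX entrypoint would.
          mkdir /data/perm-test && rmdir /data/perm-test \
            && echo "mkdir ok" || echo "mkdir failed"
      volumeMounts:
        - name: data
          mountPath: /data        # assumed mount path for the check
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: emqx-core-data-emqx-core-5d8fc69f48-0
```

If `ls -ldn` shows the directory owned by `0:0` with no write permission for others, that would be consistent with `fsGroup` not being applied to the EFS volume. The Pod's output can be read with `kubectl logs efs-perm-check -n emqx`.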

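Could you try AWS EBS as the storage class instead? We suspect that AWS EFS does not support securityContext. In addition, AWS EFS is better suited to storing "infrequently accessed file resources", which does not fit the EMQX use case.
REF: Amazon EFS Infrequent Access | Amazon Web Services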


I am using EKS + Fargate, and Amazon EBS volumes cannot be mounted to Fargate Pods.
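
If staying on EFS is the only option, one direction that may be worth testing is a statically provisioned volume backed by an EFS access point, since an access point can enforce a POSIX user and own its root directory. The following is a sketch under stated assumptions, not a confirmed fix for this issue. The PV name and the `fs-EXAMPLE` / `fsap-EXAMPLE` IDs are placeholders, and the access point is assumed to have been created beforehand with POSIX user 1000:1000 and a root directory owned by 1000:1000. The `efs-sc` class name, the 10Gi size, and the access mode are taken from the emqx.yaml above.

```yaml
# Hypothetical static PersistentVolume pointing at an EFS access point.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: emqx-core-data-pv         # assumed name, not from the report
spec:
  capacity:
    storage: 10Gi                 # required by the API; matches the claim's request
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce               # matches the volumeClaimTemplates above
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc        # must match the class requested by the claim
  csi:
    driver: efs.csi.aws.com
    # Placeholder IDs in the form <file-system-id>::<access-point-id>.
    volumeHandle: fs-EXAMPLE::fsap-EXAMPLE
```

With a volume like this, the claim generated from `volumeClaimTemplates` would need to bind to it rather than to a dynamically provisioned one. Whether EMQX then starts cleanly on Fargate still has to be verified in the actual cluster.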