kubernetes-retired/kube-aws

0.12.3 -> 0.13.0-rc1 upgrade. Workloads fail to start due to pod security policy issue

paalkr opened this issue · 7 comments

Workloads deployed to a 0.12.3 kube-aws cluster (Kubernetes v1.12.4) do not work after the cluster is updated to kube-aws 0.13.0-rc1 (Kubernetes v1.13.5).

The grafana container, deployed with helm using the prometheus-operator chart, complains about AppArmor not being enabled on the node.

Labels:             app=grafana
                    pod-template-hash=6cd75fff6b
                    release=monitoring
Annotations:        checksum/config: 112d1de8efd11e546e384adb09cee5b5a81448bfbdb77bbe096ba9fa9e0f5b85
                    checksum/dashboards-json-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
                    checksum/sc-dashboard-provider-config: 4a8da5e1302c610a2d3a86c4fb1135ee01095d9a321d918a698c29628622aa8f
                    checksum/secret: 12f8d0d2360aed7f5a689fce0bb74a98d4f6a118a3b7ff2b5cec9e9a43cce703
                    container.apparmor.security.beta.kubernetes.io/grafana: runtime/default
                    container.apparmor.security.beta.kubernetes.io/grafana-sc-dashboard: runtime/default
                    container.apparmor.security.beta.kubernetes.io/grafana-sc-datasources: runtime/default
                    kubernetes.io/psp: monitoring-grafana
                    seccomp.security.alpha.kubernetes.io/pod: docker/default
Status:             Pending
Reason:             AppArmor
Message:            Cannot enforce AppArmor: AppArmor is not enabled on the host

The error I get in the replica sets for all pods not running in the kube-system namespace is:
Error creating: pods "<deployment_name>-<hash>-" is forbidden: unable to validate against any pod security policy: []

I imagine this case should be handled by the 00-kube-aws-permissive psp, as described in
#1589
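
For context, PSP admission only allows a pod if whoever creates it (typically the replica set's service account) has been granted the use verb on at least one PodSecurityPolicy through RBAC. A ClusterRole granting that for the permissive policy would look roughly like the sketch below, based on the names used in this thread; the actual manifest shipped by kube-aws may differ.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-aws:permissive-psp
rules:
- apiGroups: ["policy"]
  resources: ["podsecuritypolicies"]
  resourceNames: ["00-kube-aws-permissive"]
  verbs: ["use"]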

Discussion on Slack:
https://kubernetes.slack.com/messages/C5GP8LPEC/convo/C5GP8LPEC-1558389645.080100/

The problem is that @paalkr has existing PodSecurityPolicies in his cluster - so we don't automatically map all service accounts, users and nodes to our permissive policy. We only do that when there are no existing policies. @paalkr I suggest that you either create a new PodSecurityPolicy and map it to the service accounts/namespaces/users you want to allow. Or use a ClusterRoleBinding to map them to our 00-kube-aws-permissive policy.
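
For the first option, mapping a single namespace to the permissive policy could look roughly like this (a sketch only; the RoleBinding name and the monitoring namespace are just examples):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: permissive-psp
  namespace: monitoring
roleRef:
  kind: ClusterRole
  name: kube-aws:permissive-psp
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  name: system:serviceaccounts:monitoring
  apiGroup: rbac.authorization.k8s.io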

Updated release note

Thanks for clarifying. I guess our best option is to update our ClusterRoleBindings.

So my quick and dirty fix to make sure that the updated cluster functions the same way as before the upgrade is to manually deploy the kube-aws:permissive-psp-cluster-wide ClusterRoleBinding after updating the control plane. For new clusters we will start to use proper Pod Security Policies.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-aws:permissive-psp-cluster-wide
roleRef:
  kind: ClusterRole
  name: kube-aws:permissive-psp
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  name: system:serviceaccounts
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:authenticated
  apiGroup: rbac.authorization.k8s.io
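
With that binding applied after the control plane update, every service account (and authenticated user) is again allowed to use the permissive policy, so workloads behave as they did on 0.12.3.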

I'm closing this issue because I managed to work around the problem by manually deploying the permissive ClusterRoleBinding. I understand it's hard to fully automate this, but I wonder if and how we might provide a better upgrade experience.

I imagine you could do something like this as well

controller:
  customFiles:
    - path: "/srv/kubernetes/manifests/custom/permissive-psp.yaml"
      permissions: 0644
      content: |
        apiVersion: rbac.authorization.k8s.io/v1
        kind: ClusterRoleBinding
        metadata:
          name: kube-aws:permissive-psp-cluster-wide
        roleRef:
          kind: ClusterRole
          name: kube-aws:permissive-psp
          apiGroup: rbac.authorization.k8s.io
        subjects:
        - kind: Group
          name: system:serviceaccounts
          apiGroup: rbac.authorization.k8s.io
        - kind: Group
          name: system:authenticated
          apiGroup: rbac.authorization.k8s.io

Yup, that worked!