Azure/aks-periscope

Pods are not being scheduled on OpenShift clusters for the aks-periscope namespace

Closed this issue · 7 comments

Describe the bug
When we try to deploy the Periscope daemonset onto an OpenShift cluster, pod creation fails with this error: Error creating: pods "aks-periscope-" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used]
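
The message above packs several rejections into one line, and one of them appears twice. As a rough sketch (not part of the original report), the distinct reasons can be pulled out with standard shell tools. The variable names `msg` and `violations` are made up for illustration, and the pattern only covers the three reasons seen in this message.

```shell
# Event message, abbreviated from the report above.
msg='unable to validate against any security context constraint: [provider restricted: .spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used]'

# Print every occurrence of a known rejection reason on its own line,
# then de-duplicate. LC_ALL=C makes the sort order byte-wise and stable.
violations=$(printf '%s\n' "$msg" \
  | grep -oE 'Host PID is not allowed to be used|hostPath volumes are not allowed to be used|Privileged containers are not allowed' \
  | LC_ALL=C sort -u)

printf '%s\n' "$violations"
```

Against a live cluster the same filter could be fed from the output of kubectl get events -n aks-periscope instead of the hard-coded string.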

To Reproduce

  1. Spin up an OpenShift cluster: https://docs.microsoft.com/en-us/azure/openshift/howto-create-private-cluster-4x
  2. Follow the instructions in the Periscope appendix and try to deploy the daemonset onto the OpenShift cluster.
  3. Wait 3 minutes, then run kubectl get events -n aks-periscope to view the error. You can also run kubectl get daemonsets -n aks-periscope, which shows that the daemonset is not in a ready state.

Expected behavior
Pods should be in the Running state after deploying the Periscope YAML onto the OpenShift cluster.

Desktop:

  • OS: WSL or unix 20.04

📓 Thanks for opening this @sophsoph321, please note that Periscope currently supports AKS clusters only and does not support other kinds of clusters.

Was it working before? Thanks 🙏

update:

Idea / thought share: this is a good issue. If this tool decides to support a wider range of cluster kinds, we should include, or at least scope for, OpenShift clusters.

Hi @sophsoph321, thanks for testing this.

It looks like ARO (Azure Red Hat OpenShift) clusters have different, more restrictive permissions by default. From your post, there are 3 different errors reported for the default Periscope config:

  1. Privileged containers are not allowed
  2. Host PID is not allowed to be used
  3. hostPath volumes are not allowed to be used

This matches what I would expect based on the section "Restricted SCC: The Most Secure Standard Choice" at the following link: https://www.openshift.com/blog/managing-sccs-in-openshift

When a pod is created without explicitly using the PodSecurityContext field or the SecurityContext field under the container specifications it will use the Restricted SCC by default.

I believe we will need to configure a Security Context Constraint (SCC) to deploy to ARO. The instructions at the link above show ways of creating one via the "oc" command line tool; I assume there are equivalent definitions for baking it into YAML.
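
As a minimal sketch of what such a YAML definition might look like, the script below writes an SCC manifest that permits exactly the three rejected settings. It has not been validated against a live ARO cluster. The SCC name `aks-periscope-scc` is hypothetical, and the service account name `aks-periscope-service-account` is an assumption that should be checked against the deployed Periscope YAML.

```shell
# Hypothetical names: adjust to match the actual Periscope deployment.
SCC_NAME="aks-periscope-scc"
NAMESPACE="aks-periscope"
SERVICE_ACCOUNT="aks-periscope-service-account"  # assumption, verify in the YAML
SCC_FILE="aks-periscope-scc.yaml"

# Write the manifest; the three "true" flags map to errors (1), (2) and (3).
cat > "$SCC_FILE" <<EOF
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: ${SCC_NAME}
allowPrivilegedContainer: true   # (1) privileged containers
allowHostPID: true               # (2) host PID
allowHostDirVolumePlugin: true   # (3) hostPath volumes
allowHostIPC: false
allowHostNetwork: false
allowHostPorts: false
readOnlyRootFilesystem: false
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
  - hostPath
  - configMap
  - secret
  - emptyDir
  - projected
users:
  - system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}
EOF

echo "Wrote ${SCC_FILE}; review it, then apply with: oc apply -f ${SCC_FILE}"
```

The script only generates the file, so it can be reviewed or committed next to the Periscope manifest before anything is applied to the cluster.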

Feel free to ping me on Teams if this doesn't make sense or you get stuck, but I imagine the ARO docs describe ways of permitting (1), (2) and (3) using an SCC. Or, if you prefer, we could try to reach out to someone on the ARO team directly?

Oh, to get started, maybe it's as simple as: oc apply -f aks-periscope.yaml --as=Privileged

@Tatsinnit, totally understand that Periscope supports AKS clusters only. However, I discussed this internally with the Arc team, and we need Periscope to work on OpenShift for the MVP of our troubleshooting tool. OpenShift is one of the distros most used among our customers, and it is the distro on which more things are likely to go wrong, which makes our troubleshooting tool all the more necessary there.
Anyway, I was able to get the Periscope pods to schedule.

Thank you @davidkydd. Just FYI, I was able to get the pods to schedule by running oc adm policy add-scc-to-user privileged system:serviceaccount::. I will close this issue for now and will communicate over mail/Teams if there are any code changes needed for Periscope to work with OpenShift.
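
The namespace and service account name are missing from the command as posted. The general subject form is system:serviceaccount:<namespace>:<name>, so a sketch of assembling the full command could look like the following. Both values are assumptions for illustration: `aks-periscope` is the namespace used in the repro steps, and `aks-periscope-service-account` is a guess at the service account name that should be checked against the deployed YAML.

```shell
# Assumed values, not confirmed by the comment above.
NAMESPACE="aks-periscope"
SERVICE_ACCOUNT="aks-periscope-service-account"

# Service account subjects take the form system:serviceaccount:<namespace>:<name>.
subject="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
cmd="oc adm policy add-scc-to-user privileged ${subject}"

# Print the command rather than running it, so it can be reviewed first.
echo "$cmd"
```

Note that privileged is the broadest built-in SCC, so this grants the service account more than the three settings Periscope was rejected for.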

That's great Sophie! Thanks for confirming the exact steps too.

Worth mentioning that I am very grateful for the excellent R&D work you have been doing to test and develop Periscope across new distros and platforms: you have contributed greatly to improving the capabilities and coverage of the tool and have been a joy to collaborate with 😃 💯 🥇

Thanks Sophie, yeah, sounds like a plan. You could either open a PR or, if you have any thoughts, feel free to add a work item to the project created here: https://github.com/Azure/aks-periscope/projects/2. Thank you 🙏