How can I disable mutation for managed sidecars?
BitRacer opened this issue · 14 comments
I'm using a managed service mesh on GKE, Anthos Service Mesh (ASM). The managed mesh injects Envoy sidecars and references their images by tag, not by sha. When the tag on the sidecar is mutated to a sha, ASM cannot determine which version is installed, and chaos ensues.
Is there a way to disable mutation for a container or a set of containers, for example gcr.io/releases/asm*?
I only want to skip this mutation for these sidecars, not for the main container, which is signed and delivered with a sha.
I can't find any examples of how to set this in the cluster policy or mutating webhook config.
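To make the request concrete, here is a minimal Python sketch of a per-image skip list. It is not policy-controller code: `SKIP_PATTERNS` and `should_mutate` are hypothetical names, the sample image references are made up, and shell-style glob matching (via `fnmatch`) is an assumption about how such patterns might be interpreted.

```python
from fnmatch import fnmatchcase

# Hypothetical skip list; the pattern is the example given in this issue.
SKIP_PATTERNS = ["gcr.io/releases/asm*"]


def should_mutate(image: str, skip_patterns=SKIP_PATTERNS) -> bool:
    """Return False when the image reference matches any skip pattern."""
    return not any(fnmatchcase(image, pattern) for pattern in skip_patterns)


if __name__ == "__main__":
    # Made-up image references, for illustration only.
    for image in ("gcr.io/releases/asm/proxyv2:1.17", "example.com/team/app:v2"):
        print(image, "->", "mutate" if should_mutate(image) else "skip")
```

Note that `*` in `fnmatch` also matches `/`, so the single pattern covers every repository under that prefix.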
@BitRacer I wouldn't recommend disabling the mutation for a container: relying on digests instead of tags is a security best practice.
It is not possible today to disable it with the existing source code. Feel free to make any changes to that logic.
It would be nice to be able to disable the mutation; with tags, it is more obvious which version of an image is running on the cluster.
There are container registries that support immutable tags anyway, which could resolve the security concern of using tags.
cosign supports verifying an image referenced by tag: during cosign verify, cosign converts the tag to a digest before doing the verification. So I don't really see the benefit of having the policy controller mutate IMAGE:TAG -> IMAGE@sha256, since we then lose the important TAG information in pod specs. It would be nice if the policy controller did not do this mutation and let the cosign library handle it. Is there a specific reason why we do this in the policy controller before calling cosign verify?
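As an illustration of what gets lost, here is a small Python sketch of the IMAGE:TAG -> IMAGE@sha256 rewrite. It is a simplified model, not the policy controller's or cosign's implementation: `split_reference` and `to_digest_reference` are hypothetical helpers, and the digest is passed in rather than resolved from a registry.

```python
def split_reference(ref: str):
    """Split an image reference into (repository, tag, digest)."""
    name, _, digest = ref.partition("@")
    # A ':' marks a tag only when it comes after the last '/';
    # otherwise it belongs to a registry port such as localhost:5000.
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    if colon > last_slash:
        return name[:colon], name[colon + 1:], digest
    return name, "", digest


def to_digest_reference(ref: str, resolved_digest: str) -> str:
    """Rewrite IMAGE:TAG to IMAGE@DIGEST; the tag is dropped."""
    repository, _tag, digest = split_reference(ref)
    # An already pinned digest is kept as-is.
    return f"{repository}@{digest or resolved_digest}"


if __name__ == "__main__":
    placeholder = "sha256:" + "0" * 64  # placeholder, not a real digest
    print(to_digest_reference("example.com/team/app:v1.2.3", placeholder))
```

After the rewrite, nothing in the reference says `v1.2.3` any more, which is the information loss described above.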
/assign
My suggestion would be to have a flag -disableMutation, similar to what we have for disabling TUF (-disableTUF), so that when this flag is set the tag-to-digest mutation does not happen. Users could then choose, based on their requirements, whether tags are mutated to digests; currently this is hardcoded.
Moreover, as I commented before, the cosign library supports verification by tag. When the policy controller invokes the cosign verify function with a tag as the image reference, cosign first converts it to a digest and only then fetches the associated signature. So having the policy controller always invoke cosign verify with a digest does not bring any additional security. This enhancement would allow users to run the policy controller without mutation if they need to.
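A rough sketch of how such a flag could gate the rewrite, written in Python for brevity. The flag does not exist today; `-disableMutation` is only the name proposed in this comment, and `build_parser` and `resolve_image` are hypothetical helpers, not policy-controller functions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="toy webhook options")
    # Hypothetical flag, named after the suggestion in this thread.
    parser.add_argument("-disableMutation", action="store_true",
                        help="leave image references as tags")
    return parser


def resolve_image(ref: str, digest: str, disable_mutation: bool) -> str:
    """Return the image reference that would be stored in the pod spec."""
    if disable_mutation or "@" in ref:
        return ref  # keep the tag (or an already pinned digest) untouched
    last_slash, colon = ref.rfind("/"), ref.rfind(":")
    repository = ref[:colon] if colon > last_slash else ref
    return f"{repository}@{digest}"


if __name__ == "__main__":
    args = build_parser().parse_args(["-disableMutation"])
    digest = "sha256:" + "0" * 64  # placeholder, not a real digest
    print(resolve_image("example.com/app:v1", digest, args.disableMutation))
```

Verification itself is unaffected in this model: only the reference written back to the resource changes.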
Hi @BitRacer, what do you mean by "chaos ensued"? I'm also using Anthos Service Mesh and I see no side effects so far. Maybe it happens when updating Istio?
@hectorj2f - could you please share your thoughts?
As mentioned above, disabling the mutation of sidecars won't follow security best practices.
However, you could try using policy rules that match specific labels, so that the policy-controller only enforces these policies on specific pods or other types of resources. That could exclude the Istio sidecar pods, although I haven't tried it myself.
Thanks, Hector, for your comment.
The intent here is not to disable mutation by default, but to have an option to do so from the chart, as we have for disable-tuf, for example.
We are facing an issue with ReplicaSets: Kubernetes keeps creating resources indefinitely.
We tried the "match" functionality so that only pods would be in scope and everything else would pass through, but it seems to affect only the validating webhook, not the mutating one.
We've found an issue with this. The scenario is the following:
A deployment is already deployed in a k8s cluster, so it has not been mutated yet.
The policy controller is enabled only later. In our case we deployed it later, but the same can happen when validation is simply turned on later for the given namespace.
Scaling the deployment causes an endless number of ReplicaSets to be generated because of the mutation. As a consequence, the pods never become ready.
What we found: a ReplicaSet is created from the deployment and then mutated, so the controller manager sees a difference, drops the ReplicaSet, and creates a new one, which also differs, and so on...
Is there an existing issue that deals with this behavior?
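The loop described above can be modeled with a few lines of Python. This is a toy model of our observation, not the actual controller-manager logic: `mutate` and `reconcile` are hypothetical names, templates are reduced to a single `image` field, and the round limit exists only so the simulation terminates.

```python
def mutate(template: dict, digest: str) -> dict:
    """Toy mutating webhook: replace the image tag with a digest."""
    image = template["image"]
    if "@" in image:
        return template  # already pinned, nothing to change
    last_slash, colon = image.rfind("/"), image.rfind(":")
    repository = image[:colon] if colon > last_slash else image
    return {**template, "image": f"{repository}@{digest}"}


def reconcile(deployment_template: dict, digest: str, max_rounds: int = 5) -> int:
    """Count the ReplicaSets the toy controller creates before it stops."""
    created = 0
    for _ in range(max_rounds):
        stored = mutate(deployment_template, digest)  # rewritten on admission
        created += 1
        if stored == deployment_template:
            break  # stored template matches the deployment, so the loop ends
        # Otherwise the controller sees a difference and tries again.
    return created


if __name__ == "__main__":
    digest = "sha256:" + "0" * 64  # placeholder, not a real digest
    print(reconcile({"image": "example.com/app:v1"}, digest))
    print(reconcile({"image": "example.com/app@" + digest}, digest))
```

In this model a tag-based deployment never converges, while a deployment whose template already carries the digest converges on the first ReplicaSet.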
Btw, Connaisseur also supports different scopes for mutation:
https://github.com/sse-secure-systems/connaisseur/releases/tag/v3.4.0
After a quick search, it seems Kyverno also does the mutation and validation at the pod level:
https://kyverno.io/docs/writing-policies/verify-images/sigstore/#certificate-based-signing-and-verification
#1105 (comment)
@hectorj2f, what if the mutating webhook respected the match?
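Roughly what I have in mind, as a Python sketch. It assumes the match is a plain label selector and reduces a resource to `labels` plus a single `image`; `matches` and `admit` are hypothetical names, not functions from the policy controller.

```python
def matches(labels: dict, match_labels: dict) -> bool:
    """True when every key/value in the selector is present on the resource."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def admit(resource: dict, match_labels: dict, digest: str) -> dict:
    """Mutate only resources selected by the policy; pass the rest through."""
    if not matches(resource.get("labels", {}), match_labels):
        return resource  # out of scope: leave the tag alone
    image = resource["image"]
    if "@" in image:
        return resource  # already pinned to a digest
    last_slash, colon = image.rfind("/"), image.rfind(":")
    repository = image[:colon] if colon > last_slash else image
    return {**resource, "image": f"{repository}@{digest}"}


if __name__ == "__main__":
    digest = "sha256:" + "0" * 64  # placeholder, not a real digest
    selector = {"verify-images": "true"}  # made-up label for illustration
    app = {"labels": {"verify-images": "true"}, "image": "example.com/app:v1"}
    sidecar = {"labels": {}, "image": "example.com/mesh/proxy:1.0"}
    print(admit(app, selector, digest)["image"])
    print(admit(sidecar, selector, digest)["image"])
```

With this behavior the validating and mutating webhooks would cover the same set of resources, and anything outside the match would keep its tag.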