redhat-cop/namespace-configuration-operator

Best way to deploy ~60 Networkpolicies?

ichmachnixichgucknur opened this issue · 11 comments

Hi,
we have been asked to apply about 60 configurations (NetworkPolicies) to about 180 namespaces (and counting).
I created a file for every config and applied them. That works, but the operator uses approx. 7 GB of RAM.

I thought about optimizing this by putting everything into a single file. On the test system, RAM usage is much lower, and the initial apply works. But applying an update breaks everything. The message is:
ERROR controller-runtime.manager.controller.namespaceconfig Reconciler error {"reconciler group": "redhatcop.redhat.io", "reconciler kind": "NamespaceConfig", "name": "netpol-rules", "namespace": "", "error": "Request entity too large: limit is 3145728"}
So this does not work.

What is the recommended way to apply such a configuration?

Thx
Ronny

The memory consumed by the operator is roughly proportional to the number of NamespaceConfigs.
Just to understand: you have 60 different network policies that need to be applied in different combinations to 180 namespaces?
Can you do some consolidation? A NamespaceConfig can carry an array of heterogeneous objects.
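
For illustration, here is a rough sketch of what such a consolidated NamespaceConfig could look like, built as a Python dict that can be dumped to JSON (which kubectl accepts in place of YAML). The field names `labelSelector`, `templates` and `objectTemplate`, and the API version `v1alpha1`, are written from memory of the operator's README and are assumptions to check against the installed CRD. The helper `build_namespace_config`, the label `netpol: default` and the two example objects are hypothetical.

```python
import json


def build_namespace_config(name, match_labels, objects):
    """Wrap several unrelated Kubernetes objects into one NamespaceConfig.

    Assumption: each entry under spec.templates holds the object as a
    string in `objectTemplate`; verify this against the CRD in use.
    """
    return {
        "apiVersion": "redhatcop.redhat.io/v1alpha1",  # version is an assumption
        "kind": "NamespaceConfig",
        "metadata": {"name": name},
        "spec": {
            "labelSelector": {"matchLabels": match_labels},
            "templates": [{"objectTemplate": json.dumps(obj)} for obj in objects],
        },
    }


if __name__ == "__main__":
    # Two objects of different kinds carried by the same NamespaceConfig.
    deny_all = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "deny-all", "namespace": "{{ .Name }}"},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "small", "namespace": "{{ .Name }}"},
        "spec": {"hard": {"pods": "10"}},
    }
    config = build_namespace_config("netpol-rules", {"netpol": "default"}, [deny_all, quota])
    print(json.dumps(config, indent=2))
```

This only shows the shape of the consolidation; it says nothing about how large the resulting object or its status becomes.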

I have 60 NetworkPolicies applied to namespaces matching one label (matching 99% of the cluster).
Working with 60 NamespaceConfigs works, but results in huge memory consumption, proportional to the number of configs, as you mentioned.
I consolidated the config into one NamespaceConfig that includes all the NetworkPolicies, which I think is what you are suggesting. Initially this works, but when I update the NamespaceConfig I get the error I mentioned. So this does not work.

Got it, I now see what the problem is. The status field is probably so big that the update fails. Let me think about what I can do in that space...
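
A back-of-the-envelope check of that hypothesis, as a small sketch. It assumes (this is not confirmed in the thread) that the status records one entry per policy and namespace pair, and it uses the figures quoted here: 60 policies, 180 namespaces, and the two limits from the error messages. The function name `per_entry_budget` is made up for this example.

```python
# Limits copied from the two error messages in this thread.
API_REQUEST_LIMIT = 3_145_728   # "Request entity too large: limit is 3145728"
GRPC_MESSAGE_LIMIT = 2_097_152  # "larger than max (2340792 vs. 2097152)"


def per_entry_budget(limit_bytes: int, policies: int, namespaces: int) -> int:
    """Bytes left per status entry if every (policy, namespace) pair is recorded."""
    entries = policies * namespaces
    return limit_bytes // entries


if __name__ == "__main__":
    for label, limit in (("API request", API_REQUEST_LIMIT),
                         ("gRPC message", GRPC_MESSAGE_LIMIT)):
        budget = per_entry_budget(limit, policies=60, namespaces=180)
        print(f"{label}: {budget} bytes per entry")
```

With 60 x 180 = 10,800 entries, that leaves roughly 291 bytes per entry under the 3145728-byte limit and 194 bytes under the 2097152-byte limit, which is not much room for a resource reference plus a condition with timestamps and messages.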

@cnuland @trevorbox I think the solution we should adopt here is to filter the successful reconcilers from the status and display only the ones that are erroring out. This would drastically reduce the size of the etcd object.
I'm currently busy and cannot work on this. Would you be able to make this fix?
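
The filtering idea can be sketched as follows. This is not the operator's actual code (the operator is written in Go); it is a minimal Python illustration under an assumed status shape: a mapping from a resource key to a list of condition dicts, where the last condition is the most recent and its `type` is either `ReconcileSuccess` or something else. The names `keep_failures` and `SUCCESS` and the status shape are hypothetical.

```python
from typing import Dict, List

SUCCESS = "ReconcileSuccess"  # assumed name of the success condition type

# Hypothetical status shape: resource key -> conditions, newest last.
Status = Dict[str, List[dict]]


def keep_failures(status: Status) -> Status:
    """Return only the entries whose most recent condition is not a success."""
    failing: Status = {}
    for key, conditions in status.items():
        if not conditions:
            # Nothing recorded yet for this resource, so nothing to report.
            continue
        if conditions[-1].get("type") != SUCCESS:
            failing[key] = conditions
    return failing


if __name__ == "__main__":
    status = {
        "ns-a/allow-dns": [{"type": "ReconcileSuccess"}],
        "ns-b/allow-dns": [{"type": "ReconcileError", "message": "boom"}],
    }
    print(keep_failures(status))
```

When everything reconciles cleanly the filtered status is empty, so the stored object stays small no matter how many namespaces match.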

@ichmachnixichgucknur would you be able to build and test this branch: https://github.com/raffaelespazzoli/namespace-configuration-operator/tree/fix%23122
It should contain the fix to this problem.

@raffaelespazzoli I will test this and come back to you.

@raffaelespazzoli sorry, but this does not work :(
2021-10-13T10:16:49.592Z ERROR controller-runtime.manager.controller.namespaceconfig Reconciler error {"reconciler group": "redhatcop.redhat.io", "reconciler kind": "NamespaceConfig", "name": "netpol-rules", "namespace": "", "error": "rpc error: code = ResourceExhausted desc = trying to send message larger than max (2340792 vs. 2097152)"}
What about a switch to disable status updates completely?

Hmm, OK. One of the issues is that this is a hard problem to reproduce. Is there any chance that I can contact you privately and you give me access to your cluster?

Access to that cluster is impossible (on-premises). But I can give you the information on how I tested this and the corresponding YAML files. Would that be OK? I'd just need a contact address.

Thanks, I’ll provide you a package with details tomorrow.