Fix race condition in IngressNodeFirewallController
andreaskaris opened this issue · 2 comments
andreaskaris commented
This happens every few times when running the unit tests. The unit tests create and delete resources in short intervals one after another, so this is definitely a race condition which will need to be ironed out. Below, we have a leftover nodestate resource from a previous test which somehow isn't cleaned up correctly:
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 does not exist, creating (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 successfully created (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 update was successful
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 update was successful
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 update was successful
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 update was successful
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 update was successful
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
2022/08/10 15:02:26 update was successful
2022/08/10 15:02:26 reconciling (apps/v1, Kind=DaemonSet) ingress-node-fw-config-test-namespace/ingress-node-firewall-daemon
•••••••••
------------------------------
• Failure [1.084 seconds]
IngressNodeFirewall controller rules
/home/akaris/development/ingress-node-firewall/controllers/ingressnodefirewall_controller_rules_test.go:20
when IngressNodeFirewall objects are created for test case: "baseline test without merging"
/home/akaris/development/ingress-node-firewall/controllers/ingressnodefirewall_controller_rules_test.go:780
The resulting IngressNodeFirewallNodeState object should look as expected [It]
/home/akaris/development/ingress-node-firewall/controllers/ingressnodefirewall_controller_rules_test.go:781
Timed out after 1.003s.
Expected
<bool>: false
to be true
/home/akaris/development/ingress-node-firewall/controllers/ingressnodefirewall_controller_test.go:343
------------------------------
STEP: Creating node worker-0
STEP: Creating new IngressNodeFirewall objects
By creating new IngressNodeFirewall object firewall-01.6601365479200597e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewallNodeState objects in namespace {"req.Name": "firewall-0", "r.Namespace": "ingress-node-fw-config-test-namespace"}
STEP: Waiting for the expected list of IngressNodeFirewallNodeStates
1.6601365479258146e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewall objects {"req.Name": "firewall-0"}
Could not find the desired number of IngressNodeFirewallNodeStates. Found 0 objects but expected to find 1 objects. Object list: []
1.660136547931877e+09 INFO controllers.IngressNodeFirewall Building the desired node state specs {"req.Name": "firewall-0"}
1.6601365479352949e+09 INFO controllers.IngressNodeFirewall Object node found, triggering creation {"req.Name": "firewall-0", "nodeToCreate": "worker-daemon"}
Could not find the desired number of IngressNodeFirewallNodeStates. Found 0 objects but expected to find 1 objects. Object list: []
1.660136547943595e+09 INFO controllers.IngressNodeFirewall Reconciling resource and programming bpf {"name": "worker-daemon", "namespace": "ingress-node-fw-config-test-namespace"}
1.6601365479459715e+09 INFO controllers.IngressNodeFirewall Created object {"req.Name": "firewall-0", "ingressNodeFirewallNodeState.Namespace": "ingress-node-fw-config-test-namespace", "ingressNodeFirewallNodeState.Name": "worker-daemon"}
1.6601365479459975e+09 INFO controllers.IngressNodeFirewall Object node found, triggering creation {"req.Name": "firewall-0", "nodeToCreate": "worker-0"}
1.6601365479492908e+09 INFO controllers.IngressNodeFirewall Reconciling resource and programming bpf {"name": "worker-daemon", "namespace": "ingress-node-fw-config-test-namespace"}
Could not find the desired number of IngressNodeFirewallNodeStates. Found 2 objects but expected to find 1 objects. Object list: [{{IngressNodeFirewallNodeState ingressnodefirewall.openshift.io/v1alpha1} {worker-0 ingress-node-fw-config-test-namespace c2ec61c8-52d0-4a4a-8a1c-f18fd00a90f8 342 1 2022-08-10 15:02:27 +0200 CEST <nil> <nil> map[] map[] [{ingressnodefirewall.openshift.io/v1alpha1 IngressNodeFirewall firewall-0 c2ca64b8-87ae-423e-8e86-b36ab427a70f <nil> <nil>}] [] [{controllers.test Update ingressnodefirewall.openshift.io/v1alpha1 2022-08-10 15:02:27 +0200 CEST FieldsV1 {"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"c2ca64b8-87ae-423e-8e86-b36ab427a70f\"}":{}}},"f:spec":{".":{},"f:interfaceIngressRules":{".":{},"f:eth0":{}}}} }]} {map[eth0:[{[10.0.0.0] [{10 {TCP 0xc00056c660 <nil> <nil> <nil> <nil>} Allow}]}]]} { }} {{IngressNodeFirewallNodeState ingressnodefirewall.openshift.io/v1alpha1} {worker-daemon ingress-node-fw-config-test-namespace aa1a09cc-f4a8-4a1e-a14e-a777a6a7c22c 341 1 2022-08-10 15:02:27 +0200 CEST <nil> <nil> map[] map[] [{ingressnodefirewall.openshift.io/v1alpha1 IngressNodeFirewall firewall-0 c2ca64b8-87ae-423e-8e86-b36ab427a70f <nil> <nil>}] [] [{controllers.test Update ingressnodefirewall.openshift.io/v1alpha1 2022-08-10 15:02:27 +0200 CEST FieldsV1 {"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"c2ca64b8-87ae-423e-8e86-b36ab427a70f\"}":{}}},"f:spec":{".":{},"f:interfaceIngressRules":{".":{},"f:eth0":{}}}} } {controllers.test Update ingressnodefirewall.openshift.io/v1alpha1 2022-08-10 15:02:27 +0200 CEST FieldsV1 {"f:status":{".":{},"f:syncStatus":{}}} status}]} {map[eth0:[{[10.0.0.0] [{10 {TCP 0xc00056c680 <nil> <nil> <nil> <nil>} Allow}]}]]} {Synchronized }}]
1.6601365479552739e+09 INFO controllers.IngressNodeFirewall Created object {"req.Name": "firewall-0", "ingressNodeFirewallNodeState.Namespace": "ingress-node-fw-config-test-namespace", "ingressNodeFirewallNodeState.Name": "worker-0"}
1.6601365479553275e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewallNodeState objects in namespace {"req.Name": "firewall-0", "r.Namespace": "ingress-node-fw-config-test-namespace"}
1.6601365479579825e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewall objects {"req.Name": "firewall-0"}
1.6601365479596775e+09 INFO controllers.IngressNodeFirewall Building the desired node state specs {"req.Name": "firewall-0"}
1.6601365479612281e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewallNodeState objects in namespace {"req.Name": "firewall-0", "r.Namespace": "ingress-node-fw-config-test-namespace"}
1.6601365479645565e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewall objects {"req.Name": "firewall-0"}
Could not find the desired number of IngressNodeFirewallNodeStates. Found 2 objects but expected to find 1 objects. Object list: [{{IngressNodeFirewallNodeState ingressnodefirewall.openshift.io/v1alpha1} {worker-0 ingress-node-fw-config-test-namespace c2ec61c8-52d0-4a4a-8a1c-f18fd00a90f8 343 1 2022-08-10 15:02:27 +0200 CEST <nil> <nil> map[] map[] [{ingressnodefirewall.openshift.io/v1alpha1 IngressNodeFirewall firewall-0 c2ca64b8-87ae-423e-8e86-b36ab427a70f <nil> <nil>}] [] [{controllers.test Update ingressnodefirewall.openshift.io/v1alpha1 2022-08-10 15:02:27 +0200 CEST FieldsV1 {"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"c2ca64b8-87ae-423e-8e86-b36ab427a70f\"}":{}}},"f:spec":{".":{},"f:interfaceIngressRules":{".":{},"f:eth0":{}}}} } {controllers.test Update ingressnodefirewall.openshift.io/v1alpha1 2022-08-10 15:02:27 +0200 CEST FieldsV1 {"f:status":{".":{},"f:syncStatus":{}}} status}]} {map[eth0:[{[10.0.0.0] [{10 {TCP 0xc00056c960 <nil> <nil> <nil> <nil>} Allow}]}]]} {Synchronized }} {{IngressNodeFirewallNodeState ingressnodefirewall.openshift.io/v1alpha1} {worker-daemon ingress-node-fw-config-test-namespace aa1a09cc-f4a8-4a1e-a14e-a777a6a7c22c 341 1 2022-08-10 15:02:27 +0200 CEST <nil> <nil> map[] map[] [{ingressnodefirewall.openshift.io/v1alpha1 IngressNodeFirewall firewall-0 c2ca64b8-87ae-423e-8e86-b36ab427a70f <nil> <nil>}] [] [{controllers.test Update ingressnodefirewall.openshift.io/v1alpha1 2022-08-10 15:02:27 +0200 CEST FieldsV1 {"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"c2ca64b8-87ae-423e-8e86-b36ab427a70f\"}":{}}},"f:spec":{".":{},"f:interfaceIngressRules":{".":{},"f:eth0":{}}}} } {controllers.test Update ingressnodefirewall.openshift.io/v1alpha1 2022-08-10 15:02:27 +0200 CEST FieldsV1 {"f:status":{".":{},"f:syncStatus":{}}} status}]} {map[eth0:[{[10.0.0.0] [{10 {TCP 0xc00056c980 <nil> <nil> <nil> <nil>} Allow}]}]]} {Synchronized }}]
(...)
1.6601365489525127e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewall objects {"req.Name": "firewall-0"}
1.6601365489577942e+09 INFO controllers.IngressNodeFirewall Building the desired node state specs {"req.Name": "firewall-0"}
1.6601365489578543e+09 INFO controllers.IngressNodeFirewall Existing object not found in desired list, triggering deletion {"req.Name": "firewall-0"}
1.6601365489717815e+09 INFO controllers.IngressNodeFirewall Existing object not found in desired list, triggering deletion {"req.Name": "firewall-0"}
1.6601365489837143e+09 INFO controllers.IngressNodeFirewall Getting all IngressNodeFirewallNodeState objects in namespace {"req.Name": "firewall-0", "r.Namespace": "ingress-node-fw-config-test-namespace"}
••••••••••2022/08/10 15:02:49 failed to shut down testEnv err timeout waiting for process kube-apiserver to stop
Summarizing 1 Failure:
[Fail] IngressNodeFirewall controller rules when IngressNodeFirewall objects are created for test case: "baseline test without merging" [It] The resulting IngressNodeFirewallNodeState object should look as expected
/home/akaris/development/ingress-node-firewall/controllers/ingressnodefirewall_controller_test.go:343
Ran 20 of 20 Specs in 29.974 seconds
FAIL! -- 19 Passed | 1 Failed | 0 Pending | 0 Skipped
msherif1234 commented
we aren't down yet with race issues
msherif1234 commented
ut looks better now at least been consistent with CI run so I will close this for now and if we hit another race we can reopen or create new issue