bloomberg/goldpinger

When PING_NUMBER is nonzero, many nodes are immediately marked as unhealthy

Describe the bug
If I start with 50 healthy nodes and then set PING_NUMBER to 20, roughly 30 nodes get marked as unhealthy. It appears that only 20 nodes ping 20 nodes, instead of all 50 nodes each pinging 20 nodes. When I unset PING_NUMBER, the nodes go back to being healthy.

I think the bug is here. It seems to me that every pod from AllPods() should be "checked", but only the pods from SelectPods() are actually checked, which results in SelectPods() pinging SelectPods().

Expected behavior
All nodes ping PING_NUMBER nodes, and no nodes are marked as unhealthy as a result of setting PING_NUMBER.
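
To make the difference concrete, here is a simplified sketch of the two behaviors. This is not goldpinger's actual code: `makeNodes`, `ringPeers`, and `countUnpinged` are hypothetical names, and the ring-based peer selection is an assumption I chose only to keep the example deterministic. A node counts as "unreached" here if no other node pings it.

```go
package main

import "fmt"

// makeNodes builds n hypothetical node names: node-0, node-1, ...
func makeNodes(n int) []string {
	nodes := make([]string, n)
	for i := range nodes {
		nodes[i] = fmt.Sprintf("node-%d", i)
	}
	return nodes
}

// ringPeers returns up to n nodes that follow index i in the pool, treated as
// a ring. It is an illustrative stand-in for per-node peer selection.
func ringPeers(pool []string, i, n int) []string {
	if len(pool) == 0 {
		return nil
	}
	// n <= 0 means "no limit"; a node never pings itself.
	if n <= 0 || n >= len(pool) {
		n = len(pool) - 1
	}
	peers := make([]string, 0, n)
	for k := 1; k <= n; k++ {
		peers = append(peers, pool[(i+k)%len(pool)])
	}
	return peers
}

// countUnpinged reports how many nodes in all are pinged by nobody when only
// the nodes in pingers do the pinging, each choosing n peers among pingers.
func countUnpinged(all, pingers []string, n int) int {
	pinged := make(map[string]bool)
	for i := range pingers {
		for _, p := range ringPeers(pingers, i, n) {
			pinged[p] = true
		}
	}
	missing := 0
	for _, node := range all {
		if !pinged[node] {
			missing++
		}
	}
	return missing
}

func main() {
	all := makeNodes(50)
	pingNumber := 20

	// Expected: all 50 nodes ping 20 peers each.
	fmt.Println("expected, unreached nodes:", countUnpinged(all, all, pingNumber))

	// Suspected: only a 20-node subset pings, and only within that subset.
	fmt.Println("suspected, unreached nodes:", countUnpinged(all, all[:pingNumber], pingNumber))
}
```

In this sketch the first case leaves 0 nodes unreached and the second leaves 30, which matches the roughly 30 unhealthy nodes I see with 50 nodes and PING_NUMBER set to 20.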