high rate of Peers.garbageCollect() calls resulting from topology gossip
murali-reddy opened this issue · 0 comments
mesh performs peers.garbageCollect on two occasions:
- when the peer sees a direct topology change because a connection gets deleted
- when performing OnGossipBroadcast for topology gossip data
Peers.garbageCollect() invokes Peer.routes(), which is an O(n^2) operation, so it consumes significant CPU when the topology changes significantly.
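To illustrate why a quadratic cost matters at this scale, here is a minimal sketch (not the mesh implementation) that assumes a fully connected topology and counts how many connection entries a breadth-first walk from one peer inspects. The function name edgeVisits is hypothetical.

```go
package main

import "fmt"

// edgeVisits runs a breadth-first walk over a fully connected topology of
// n peers and returns how many connection entries it inspects.
// Illustration only: this is an assumption, not how mesh computes routes.
func edgeVisits(n int) int {
	if n == 0 {
		return 0
	}
	visited := make([]bool, n)
	visited[0] = true
	queue := []int{0}
	visits := 0
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		// In a full mesh every peer holds a connection to every other peer.
		for next := 0; next < n; next++ {
			if next == cur {
				continue
			}
			visits++
			if !visited[next] {
				visited[next] = true
				queue = append(queue, next)
			}
		}
	}
	return visits
}

func main() {
	for _, n := range []int{10, 50, 150} {
		fmt.Printf("peers=%d connection entries inspected=%d\n", n, edgeVisits(n))
	}
}
```

For 150 peers this walk inspects 150 x 149 = 22,350 connection entries, and that work is repeated on every call.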
The following metrics (calls per second) were gathered with an instrumented mesh on a 150-node Kubernetes cluster running weave-net, which uses mesh. It is not uncommon for someone to apply a daemon set that results in each node connecting to the rest of the peers (so n^2 connections), causing significant topology changes and hence topology gossip.
===================================================================
2019-09-17 7:22:0 Peers.garbageCollect(): 365
2019-09-17 7:22:0 routes.calculate() -> routes.calculateBroadcast(): 59
2019-09-17 7:22:0 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 335
2019-09-17 7:22:0 routes.calculateUnicast(): 119
2019-09-17 7:22:0 connectionMaker.refresh(): 63
2019-09-17 7:22:0 rx gossip unicast: 0
2019-09-17 7:22:0 rx gossip broadcast: 325
2019-09-17 7:22:0 gossip broadcast - relay broadcasts: 345
2019-09-17 7:22:0 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:1 Peers.garbageCollect(): 347
2019-09-17 7:22:1 routes.calculate() -> routes.calculateBroadcast(): 68
2019-09-17 7:22:1 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 328
2019-09-17 7:22:1 routes.calculateUnicast(): 135
2019-09-17 7:22:1 connectionMaker.refresh(): 70
2019-09-17 7:22:1 rx gossip unicast: 0
2019-09-17 7:22:1 rx gossip broadcast: 316
2019-09-17 7:22:1 gossip broadcast - relay broadcasts: 324
2019-09-17 7:22:1 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:2 Peers.garbageCollect(): 369
2019-09-17 7:22:2 routes.calculate() -> routes.calculateBroadcast(): 61
2019-09-17 7:22:2 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 313
2019-09-17 7:22:2 routes.calculateUnicast(): 124
2019-09-17 7:22:2 connectionMaker.refresh(): 64
2019-09-17 7:22:2 rx gossip unicast: 0
2019-09-17 7:22:2 rx gossip broadcast: 315
2019-09-17 7:22:2 gossip broadcast - relay broadcasts: 343
2019-09-17 7:22:2 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:3 Peers.garbageCollect(): 336
2019-09-17 7:22:3 routes.calculate() -> routes.calculateBroadcast(): 75
2019-09-17 7:22:3 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 327
2019-09-17 7:22:3 routes.calculateUnicast(): 148
2019-09-17 7:22:3 connectionMaker.refresh(): 75
2019-09-17 7:22:3 rx gossip unicast: 0
2019-09-17 7:22:3 rx gossip broadcast: 322
2019-09-17 7:22:3 gossip broadcast - relay broadcasts: 326
2019-09-17 7:22:3 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:4 Peers.garbageCollect(): 353
2019-09-17 7:22:4 routes.calculate() -> routes.calculateBroadcast(): 69
2019-09-17 7:22:4 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 344
2019-09-17 7:22:4 routes.calculateUnicast(): 138
2019-09-17 7:22:4 connectionMaker.refresh(): 71
2019-09-17 7:22:4 rx gossip unicast: 0
2019-09-17 7:22:4 rx gossip broadcast: 339
2019-09-17 7:22:4 gossip broadcast - relay broadcasts: 337
2019-09-17 7:22:4 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:5 Peers.garbageCollect(): 323
2019-09-17 7:22:5 routes.calculate() -> routes.calculateBroadcast(): 68
2019-09-17 7:22:5 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 330
2019-09-17 7:22:5 routes.calculateUnicast(): 136
2019-09-17 7:22:5 connectionMaker.refresh(): 70
2019-09-17 7:22:5 rx gossip unicast: 0
2019-09-17 7:22:5 rx gossip broadcast: 328
2019-09-17 7:22:5 gossip broadcast - relay broadcasts: 311
2019-09-17 7:22:5 gossip broadcast - topology updates: 3
===================================================================
2019-09-17 7:22:6 Peers.garbageCollect(): 340
2019-09-17 7:22:6 routes.calculate() -> routes.calculateBroadcast(): 78
2019-09-17 7:22:6 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 320
2019-09-17 7:22:6 routes.calculateUnicast(): 156
2019-09-17 7:22:6 connectionMaker.refresh(): 82
2019-09-17 7:22:6 rx gossip unicast: 0
2019-09-17 7:22:6 rx gossip broadcast: 321
2019-09-17 7:22:6 gossip broadcast - relay broadcasts: 322
2019-09-17 7:22:6 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:7 Peers.garbageCollect(): 321
2019-09-17 7:22:7 routes.calculate() -> routes.calculateBroadcast(): 85
2019-09-17 7:22:7 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 300
2019-09-17 7:22:7 routes.calculateUnicast(): 172
2019-09-17 7:22:7 connectionMaker.refresh(): 90
2019-09-17 7:22:7 rx gossip unicast: 0
2019-09-17 7:22:7 rx gossip broadcast: 296
2019-09-17 7:22:7 gossip broadcast - relay broadcasts: 309
2019-09-17 7:22:7 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:8 Peers.garbageCollect(): 313
2019-09-17 7:22:8 routes.calculate() -> routes.calculateBroadcast(): 81
2019-09-17 7:22:8 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 308
2019-09-17 7:22:8 routes.calculateUnicast(): 161
2019-09-17 7:22:8 connectionMaker.refresh(): 85
2019-09-17 7:22:8 rx gossip unicast: 0
2019-09-17 7:22:8 rx gossip broadcast: 309
2019-09-17 7:22:8 gossip broadcast - relay broadcasts: 291
2019-09-17 7:22:8 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:9 Peers.garbageCollect(): 316
2019-09-17 7:22:9 routes.calculate() -> routes.calculateBroadcast(): 84
2019-09-17 7:22:9 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 307
2019-09-17 7:22:9 routes.calculateUnicast(): 167
2019-09-17 7:22:9 connectionMaker.refresh(): 88
2019-09-17 7:22:9 rx gossip unicast: 0
2019-09-17 7:22:9 rx gossip broadcast: 302
2019-09-17 7:22:9 gossip broadcast - relay broadcasts: 306
2019-09-17 7:22:9 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:10 Peers.garbageCollect(): 312
2019-09-17 7:22:10 routes.calculate() -> routes.calculateBroadcast(): 83
2019-09-17 7:22:10 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 278
2019-09-17 7:22:10 routes.calculateUnicast(): 166
2019-09-17 7:22:10 connectionMaker.refresh(): 85
2019-09-17 7:22:10 rx gossip unicast: 0
2019-09-17 7:22:10 rx gossip broadcast: 275
2019-09-17 7:22:10 gossip broadcast - relay broadcasts: 300
2019-09-17 7:22:10 gossip broadcast - topology updates: 2
===================================================================
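Per-second counts like the ones above could be collected with a small counter that is incremented at each instrumented call site and drained once per second. The sketch below is an assumption about how such instrumentation might look, not the instrumentation actually used here; callCounter, inc, snapshot, and report are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// callCounter counts named events and can be drained once per interval.
// The type and its methods are hypothetical, not part of mesh.
type callCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newCallCounter() *callCounter {
	return &callCounter{counts: make(map[string]int)}
}

// inc records one call for the given name.
func (c *callCounter) inc(name string) {
	c.mu.Lock()
	c.counts[name]++
	c.mu.Unlock()
}

// snapshot returns the counts since the previous snapshot and resets them.
func (c *callCounter) snapshot() map[string]int {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := c.counts
	c.counts = make(map[string]int)
	return out
}

// report prints one sorted snapshot per tick until stop is closed.
func (c *callCounter) report(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case now := <-ticker.C:
			snap := c.snapshot()
			names := make([]string, 0, len(snap))
			for name := range snap {
				names = append(names, name)
			}
			sort.Strings(names)
			for _, name := range names {
				fmt.Printf("%s %s: %d\n", now.Format("2006-01-02 15:04:05"), name, snap[name])
			}
		case <-stop:
			return
		}
	}
}

func main() {
	c := newCallCounter()
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		c.report(time.Second, stop)
		close(done)
	}()
	// Stand-in for an instrumented call site such as Peers.garbageCollect().
	for i := 0; i < 300; i++ {
		c.inc("Peers.garbageCollect()")
		time.Sleep(5 * time.Millisecond)
	}
	close(stop)
	<-done
}
```

Draining the map under the same mutex as inc keeps each reported interval disjoint, so the printed numbers can be read directly as calls per second.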