Incorrect ViewChange messages consensus calculating
Toktar opened this issue · 1 comments
The simulation test for ViewChange sometimes fails because of a problem with getting a checkpoint during the phase of collecting ViewChange messages, in the method calc_checkpoint(). It receives a list of ViewChange messages as a parameter. Suppose it is a 4-node pool and the list contains the following ViewChange messages:
- ViewChange_1 | checkpoints_ends: 10, 20 | stable_checkpoint: 10
- ViewChange_2 | checkpoints_ends: 0, 10 | stable_checkpoint: 0
- ViewChange_3 | checkpoints_ends: 0 | stable_checkpoint: 0
Then we don't have a strong consensus of 3 (n - f = 4 - 1) checkpoints with the same checkpoint end. This means that the node can't finish the view change.
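To make the arithmetic concrete, here is a minimal sketch of the counting described above. It is not the actual calc_checkpoint() implementation from indy-plenum: the names ViewChangeSketch, strong_quorum and find_agreed_checkpoint are hypothetical, and the rule "pick the highest checkpoint end reported by at least n - f messages" is a simplifying assumption based only on this description.

```python
from collections import Counter
from typing import NamedTuple, Optional, Sequence, Tuple


class ViewChangeSketch(NamedTuple):
    """Hypothetical stand-in for a ViewChange message (not the plenum class)."""
    checkpoint_ends: Tuple[int, ...]
    stable_checkpoint: int


def strong_quorum(n: int) -> int:
    """Return n - f, where f = (n - 1) // 3 is the number of tolerated faulty nodes."""
    f = (n - 1) // 3
    return n - f


def find_agreed_checkpoint(view_changes: Sequence[ViewChangeSketch], n: int) -> Optional[int]:
    """Return the highest checkpoint end reported by at least n - f messages, or None."""
    votes = Counter()
    for vc in view_changes:
        # Count each checkpoint end at most once per message.
        votes.update(set(vc.checkpoint_ends))
    agreed = [end for end, count in votes.items() if count >= strong_quorum(n)]
    return max(agreed) if agreed else None


# The three messages from the example above, in a 4-node pool.
messages = [
    ViewChangeSketch(checkpoint_ends=(10, 20), stable_checkpoint=10),
    ViewChangeSketch(checkpoint_ends=(0, 10), stable_checkpoint=0),
    ViewChangeSketch(checkpoint_ends=(0,), stable_checkpoint=0),
]

result = find_agreed_checkpoint(messages, n=4)
print(result)  # None: ends 0 and 10 each have 2 votes, end 20 has 1, and 3 are needed
```

Under these assumptions no checkpoint end collects the 3 votes required, which matches the failure described in this issue.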
Expected problem: with low probability, one or more nodes may not finish the view change and, after a short period, will just start a new one.
With an incredibly low probability, a pool can freeze in endless view changes, but this can be fixed by a POOL_RESTART transaction.
We don’t think there is a big chance of hitting this case, but we need to keep it in mind and fix it.
When addressing this issue, please ensure that any workarounds, such as the following, are also addressed:
indy-plenum/plenum/test/consensus/view_change/test_sim_view_change.py
Lines 90 to 92 in 705582e