Finalizer with 2/3 of total deposit stops finalizing after a few epochs
Opened this issue · 1 comments
Describe the bug
Consider the following setup:
- There are two finalizers.
- The first finalizer holds 2/3 of the total deposit and is able to finalize an epoch alone.
- Only the first finalizer casts votes.
After a few finalized epochs finalization fails. For some reason, the votes of the first finalizer become unable to finalize an epoch.
To Reproduce
The following test
#!/usr/bin/env python3
from test_framework.test_framework import UnitETestFramework
from test_framework.util import (
assert_finalizationstate,
disconnect_nodes,
generate_block,
)
ESPERANZA_CONFIG = '-esperanzaconfig={"epochLength": 5}'
class DepositBug(UnitETestFramework):
def set_test_params(self):
self.num_nodes = 3
self.extra_args = [
[ESPERANZA_CONFIG],
[ESPERANZA_CONFIG, '-validating=1'],
[ESPERANZA_CONFIG, '-validating=1'],
]
self.setup_clean_chain = True
def run_test(self):
p0, v0, v1 = self.nodes
self.setup_stake_coins(p0, v0, v1)
# Leave IBD
self.generate_sync(p0)
v0_addr = v0.getnewaddress("", "legacy")
v0.deposit(v0_addr, 3000)
v1_addr = v1.getnewaddress("", "legacy")
v1.deposit(v1_addr, 1500)
generate_block(p0, count=14)
disconnect_nodes(p0, v0.index)
disconnect_nodes(p0, v1.index)
disconnect_nodes(v0, v1.index)
generate_block(p0, count=1)
assert_finalizationstate(p0, {'currentEpoch': 4,
'lastJustifiedEpoch': 2,
'lastFinalizedEpoch': 2,
'validators': 2})
self.wait_for_vote_and_disconnect(finalizer=v0, node=p0)
generate_block(p0, count=5)
assert_finalizationstate(p0, {'currentEpoch': 5,
'lastJustifiedEpoch': 3,
'lastFinalizedEpoch': 3})
self.wait_for_vote_and_disconnect(finalizer=v0, node=p0)
generate_block(p0, count=5)
assert_finalizationstate(p0, {'currentEpoch': 6,
'lastJustifiedEpoch': 4,
'lastFinalizedEpoch': 4})
self.wait_for_vote_and_disconnect(finalizer=v0, node=p0)
generate_block(p0, count=5)
assert_finalizationstate(p0, {'currentEpoch': 7,
'lastJustifiedEpoch': 5,
'lastFinalizedEpoch': 5})
if __name__ == '__main__':
DepositBug().main()
fails with the error:
Traceback (most recent call last):
File "test/functional/test_framework/test_framework.py", line 203, in main
self.run_test()
File "test/functional/finalization_deposit_bug.py", line 87, in run_test
'lastFinalizedEpoch': 5})
File "test/functional/test_framework/util.py", line 313, in assert_finalizationstate
raise AssertionError('\n\t\t\t\t'.join(errors))
AssertionError: lastJustifiedEpoch: not(4 == 5)
lastFinalizedEpoch: not(4 == 5)
Expected behavior
Finalizer with 2/3 of total deposit should be able to finalize any epoch if it's the only finalizer which votes.
@Gnappuraz and I reviewed it, and this issue occurs because we want to finalize epoch not when 2/3 of votes are reached but when ">2/3" of votes are reached. The way how it's implemented, there is an equal comparison ">=2/3" but! current total votes don't include the current reward factor; however, the threshold (2/3) includes it which "more or less" turns this ">=" sign into ">" one.
Here we can see https://github.com/dtr-org/unit-e/blob/master/src/esperanza/finalizationstate.cpp#L529-L532 that reward was attributed to m_cur_dyn_deposits
and m_prev_dyn_deposits
but not to curDynastyVotes
or prevDynastyVotes
This implementation we borrowed from Casper, and I think we should change it to have a strict >2/3
comparison and attribute current reward to both, total votes and the threshold. Then we won't have this "weird behavior" when the current reward factor is >0, and now we see that 2/3 of votes don't finalize when they used to finalize. Another issue with the current implementation is that if the reward factor is too large (because there are no finalization for decades*), you'll never able to finalize, even if everyone votes