Besu failing to index event log
ajsutton opened this issue · 9 comments
Description
On one of the EF teku-besu kiln nodes, the besu instance is failing to return one of the deposit event logs. Since the CL requires all deposit events in order Teku is complaining about this quite a lot.
This is on box 164.92.203.174
JSON RPC request is:
{"jsonrpc":"2.0","method":"eth_getLogs","params":[{"topics":["0x649bbc62d0e31342afea4e5cd82d4049e7e1ee912fc0889aa790803be39038c5"],"fromBlock":"0x8547d","toBlock":"0x8547d","address":["0x4242424242424242424242424242424242424242"]}],"id":0}
The broken besu instance returns:
{
"jsonrpc" : "2.0",
"id" : 0,
"result" : [ ]
}
But another teku-besu instance on kiln returns the deposit event correctly:
{
"jsonrpc" : "2.0",
"id" : 0,
"result" : [ {
"logIndex" : "0x0",
"removed" : false,
"blockNumber" : "0x8547d",
"blockHash" : "0xa62e2b2f55447ecbf5b603fa032746222da6039f27851175a3214812029ebc70",
"transactionHash" : "0x589c0803aef2b3d0ecd48888bd0f9d799fd2421860c46bf91e2e21f0eb82414d",
"transactionIndex" : "0x3c",
"address" : "0x4242424242424242424242424242424242424242",
"data" : "0x00000000000000000000000000000000000000000000000000000000000000a000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000140000000000000000000000000000000000000000000000000000000000000018000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000030933f02fc10608e5b41513035b9f8f08306a0c54fcb2b04a673881987e6eb089b983481f5d888d4f6e08ebffbb41fcc4f000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000020005db87e87315c6545e51a5daff02eab6df948cc8c9a49d6d7d03b80d1f18096000000000000000000000000000000000000000000000000000000000000000800405973070000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000060aa19ef95787fe491388337f47fdb67936ae0aeb437766b5cc44402c8ecd08d51fe7cca7643d85fca8ec0a1c16f5102210e4b41bb6ef2f84964bdfeb733121d046e0d5cd7eb85ce8e399d0d4e61cc0cb979f58717e925e0c06a734f0311a772a00000000000000000000000000000000000000000000000000000000000000008442d000000000000000000000000000000000000000000000000000000000000",
"topics" : [ "0x649bbc62d0e31342afea4e5cd82d4049e7e1ee912fc0889aa790803be39038c5" ]
} ]
}
Somehow besu has failed to index this deposit event.
Had a second instance of this on ropsten (post merge). Resolved by stopping besu, deleting all the caches/*.cache
files and restarting besu again. So besu had the deposit log fine but created incorrect log bloom cache files.
I finally have evidence of this leading to a failed block proposal - presumably because the block would have contained deposits?
https://ropsten.beaconcha.in/block/260588
Validator successfully attested during the epoch, but failed to propose a block.
Prysm shows
ERROR powchain: Unable to process past deposit contract logs: Could not process deposit log: received incorrect merkle index: wanted 2974 but got 2980
edit Confirmed that rm caches/*.cache
resolved the issue.
I noticed in the Teku case it is calling with the same from and to values, indicating a single block. I assume it is doing that because it wants to go through the bloom cache, as opposed to just passing the hash of the single block which would not use the cache.
@SeaMonkey82 is prysm doing the same thing?
Since clearing the cache, I haven't run into this again with prysm-besu, but for reference to the exact parameters prysm is using when calling eth_getLogs
when this error occurs, you can check my thread for the same issue with erigon where it still happens consistently.
It looks like it calls for a range of 1000 blocks.
I noticed in the Teku case it is calling with the same from and to values, indicating a single block. I assume it is doing that because it wants to go through the bloom cache, as opposed to just passing the hash of the single block which would not use the cache.
We're not deliberately trying to hit or avoid the cache here - just using the same code path we use when pulling ranges to get the initial deposits. Just that once we catch up with the "head" (actually head minus follow distance) we are updating on each new block so the range to query is just a single block.
We haven't seen this issue since updating to 22.4.4. So it seems the fix that went in there has worked - we used to hit this every couple of days.
Thanks folks, closing as fixed by underlying stateroot solution.
Found this on Besu-Lodestar node on Ropsten, after restarting Lodestar, Besu was updated before to 22.7.0
lodestar-beacon | Aug-05 09:35:06.283[ETH1] error: Error updating eth1 chain cache Non consecutive deposits. deposit[6] = 1911, deposit[7] 1913
lodestar-beacon | Error: Non consecutive deposits. deposit[6] = 1911, deposit[7] 1913
lodestar-beacon | at assertConsecutiveDeposits (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/utils/eth1DepositEvent.ts:9:13)
lodestar-beacon | at Eth1DepositsCache.add (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositsCache.ts:42:5)
lodestar-beacon | at Eth1DepositDataTracker.updateDepositCache (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositDataTracker.ts:206:30)
lodestar-beacon | at runMicrotasks (<anonymous>)
lodestar-beacon | at processTicksAndRejections (node:internal/process/task_queues:96:5)
lodestar-beacon | at Eth1DepositDataTracker.update (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositDataTracker.ts:186:33)
lodestar-beacon | at Eth1DepositDataTracker.runAutoUpdate (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositDataTracker.ts:149:29)
lodestar-beacon | Aug-05 09:35:07.299[ETH1] error: Error updating eth1 chain cache Non consecutive deposits. deposit[6] = 1911, deposit[7] 1913
lodestar-beacon | Error: Non consecutive deposits. deposit[6] = 1911, deposit[7] 1913
lodestar-beacon | at assertConsecutiveDeposits (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/utils/eth1DepositEvent.ts:9:13)
lodestar-beacon | at Eth1DepositsCache.add (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositsCache.ts:42:5)
lodestar-beacon | at Eth1DepositDataTracker.updateDepositCache (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositDataTracker.ts:206:30)
lodestar-beacon | at runMicrotasks (<anonymous>)
lodestar-beacon | at processTicksAndRejections (node:internal/process/task_queues:96:5)
lodestar-beacon | at Eth1DepositDataTracker.update (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositDataTracker.ts:186:33)
lodestar-beacon | at Eth1DepositDataTracker.runAutoUpdate (file:///usr/app/node_modules/@lodestar/beacon-node/src/eth1/eth1DepositDataTracker.ts:149:29)
I am going to resync Besu from scratch, but will keep the previous data dir in case you need it to debug