Later-started nodes panic when a large number of nodes are started
torao opened this issue · 0 comments
Tendermint version (use tendermint version or git rev-parse --verify HEAD if installed from source):
0.33.6-0.3-17077261
Environment:
- OS (e.g. from /etc/os-release): CentOS 7
- Install tools:
- Others:
What happened:
When I started 100 nodes of the BLS signature aggregation version for performance testing, the nodes started after roughly the 70th to 80th in the startup order hit the following panic and their processes aborted.
I[2021-02-10|08:45:36.399] Version info module=main software=0.33.6 block=10 p2p=7
panic: Aggregated commit cannot make a VoteSet
goroutine 1 [running]:
github.com/tendermint/tendermint/types.CommitToVoteSet(0xc0012f2300, 0x11, 0xc00012af00, 0xc00047eca0, 0x0)
github.com/tendermint/tendermint/types/block.go:765 +0x4d9
github.com/tendermint/tendermint/consensus.(*State).reconstructLastCommit(0xc0004d1600, 0xa, 0x0, 0xc0012df5c0, 0x6, 0xc0012f2300, 0x11, 0xc0012df5c8, 0x2, 0xc0012f2320, ...)
github.com/tendermint/tendermint/consensus/state.go:543 +0x8b
github.com/tendermint/tendermint/consensus.NewState(0xc0001442d0, 0xa, 0x0, 0xc0012df5c0, 0x6, 0xc0012f2300, 0x11, 0xc0012df5c8, 0x2, 0xc0012f2320, ...)
github.com/tendermint/tendermint/consensus/state.go:222 +0x51c
github.com/tendermint/tendermint/node.createConsensusReactor(0xc00013a160, 0xa, 0x0, 0xc0012df5c0, 0x6, 0xc0012f2300, 0x11, 0xc0012df5c8, 0x2, 0xc0012f2320, ...)
github.com/tendermint/tendermint/node/node.go:383 +0x19b
github.com/tendermint/tendermint/node.NewNode(0xc00013a160, 0x13dfaa0, 0xc0002483c0, 0xc001311de0, 0x13c19a0, 0xc001315260, 0xc00131a080, 0x127b4f0, 0xc00131a090, 0x13df960, ...)
github.com/tendermint/tendermint/node/node.go:658 +0xa13
github.com/tendermint/tendermint/node.DefaultNewNode(0xc00013a160, 0x13df960, 0xc001314b60, 0xc000277c58, 0xdb6bdd, 0xc000117080)
github.com/tendermint/tendermint/node/node.go:102 +0x544
github.com/tendermint/tendermint/cmd/tendermint/commands.NewRunNodeCmd.func1(0xc000117080, 0xc000086840, 0x0, 0x1, 0x0, 0x0)
github.com/tendermint/tendermint/cmd/tendermint/commands/run_node.go:106 +0x7a
github.com/spf13/cobra.(*Command).execute(0xc000117080, 0xc000086830, 0x1, 0x1, 0xc000117080, 0xc000086830)
github.com/spf13/cobra@v1.0.0/command.go:842 +0x453
github.com/spf13/cobra.(*Command).ExecuteC(0x1b51e20, 0x2, 0xc000019100, 0x1121f4b)
github.com/spf13/cobra@v1.0.0/command.go:950 +0x349
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/cobra@v1.0.0/command.go:887
github.com/tendermint/tendermint/libs/cli.Executor.Execute(0x1b51e20, 0x127cf68, 0x2, 0xc00002f040)
github.com/tendermint/tendermint/libs/cli/setup.go:89 +0x3c
main.main()
github.com/tendermint/tendermint/cmd/tendermint/main.go:48 +0x2f5
This is reproducible in my environment and seems to affect the nodes that are started later, but I cannot predict how many nodes will fail to start.
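The code path in the trace (NewState, then reconstructLastCommit, then CommitToVoteSet) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual implementation: the `Commit` type, its `AggregatedSignature` field, and the lowercase function names are hypothetical stand-ins, and the height-0 early return mirrors what upstream Tendermint 0.33 does but may differ in this version. The only details taken from the report are the panic message and the call order.

```go
package main

import "fmt"

// Commit is a hypothetical, simplified stand-in for the stored commit.
// AggregatedSignature is assumed to be set when the individual
// signatures have been combined into one BLS signature.
type Commit struct {
	Signatures          [][]byte
	AggregatedSignature []byte
}

// commitToVoteSet models types.CommitToVoteSet: it needs one signature
// per vote, so it panics when only an aggregated signature is available.
func commitToVoteSet(c *Commit) [][]byte {
	if len(c.AggregatedSignature) > 0 {
		panic("Aggregated commit cannot make a VoteSet")
	}
	return c.Signatures
}

// reconstructLastCommit models the startup step that rebuilds the last
// commit's votes. The panic is recovered here only so the sketch can
// report it; in the real node it aborts the process.
func reconstructLastCommit(lastBlockHeight int64, seen *Commit) (votes int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	if lastBlockHeight == 0 {
		// Assumption: nothing to reconstruct before the first block.
		return 0, nil
	}
	return len(commitToVoteSet(seen)), nil
}

func main() {
	// A node with no committed block yet never reaches commitToVoteSet.
	fmt.Println(reconstructLastCommit(0, nil))

	// A node whose last commit still holds individual signatures succeeds.
	fmt.Println(reconstructLastCommit(10, &Commit{Signatures: [][]byte{{1}, {2}}}))

	// A node whose last commit is aggregated hits the panic from the report.
	fmt.Println(reconstructLastCommit(10, &Commit{AggregatedSignature: []byte{1}}))
}
```

If this model is accurate, only nodes that already have a non-zero last block height and an aggregated stored commit at startup would reach the panic, which would be consistent with the failure appearing only on nodes started later. That remains a hypothesis to verify against the real code.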