Detach state persistence and block persistence
erikzhang opened this issue · 15 comments
Currently, state persistence occurs at the same time as the block is persisted, which consumes more time and resources. This slows down block relay and synchronization.
If we separate state persistence from block persistence, the consensus node can quickly process transactions without waiting for state writes and smart contract execution.
Currently, it is hard for light clients to get information on smart contract invocations from the blocks themselves. If a radix tree/Patricia trie were implemented to save each of the invocation states in the block as another root, this problem would be solved. There would be no need to run an API server or a VM to get the information needed. For this to happen, state persistence would need to happen before block persistence, so that the state root can be persisted in the block header.
The number of nodes willing to run an API server to feed smart contract information may also be small, which will lead to slower performance for light clients and to more of a node's resources being used to serve light clients instead of relaying blocks.
One solution to the block relay and synchronisation problem is to have more nodes on the network whose sole job is to relay blocks.
Decoupling state and block persistence sounds interesting.
It would also mean that transaction throughput would no longer equal smart contract or state change throughput, and you may have trouble with any SCs that require time guarantees on execution (we could be on block 4,000,000 while the nodes are still working their way through block 3,995,000's SC execution).
How would you know what state the nodes were on with regards to state changes?
We can expose three interfaces:
header_height
block_height
state_height
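A minimal sketch of how a node might track the three heights once state persistence is detached from block persistence. The class and method names (NodeHeights, on_header, on_block_persisted, on_state_persisted, state_lag) are hypothetical and only illustrate the ordering header_height >= block_height >= state_height; this is not the actual node implementation.

```python
from dataclasses import dataclass


@dataclass
class NodeHeights:
    """Hypothetical tracker for the three heights named in this thread."""
    header_height: int = 0
    block_height: int = 0
    state_height: int = 0

    def on_header(self) -> None:
        # A new header has been received and validated.
        self.header_height += 1

    def on_block_persisted(self) -> None:
        # Blocks are persisted without waiting for state writes.
        if self.block_height >= self.header_height:
            raise ValueError("block cannot be persisted before its header")
        self.block_height += 1

    def on_state_persisted(self) -> None:
        # State (smart contract execution results) catches up separately.
        if self.state_height >= self.block_height:
            raise ValueError("state cannot be persisted before its block")
        self.state_height += 1

    def state_lag(self) -> int:
        # Number of blocks whose state has not been persisted yet.
        return self.block_height - self.state_height


heights = NodeHeights()
for _ in range(3):
    heights.on_header()
for _ in range(2):
    heights.on_block_persisted()
heights.on_state_persisted()
print(heights, "lag:", heights.state_lag())
```

A client could poll the three values to see how far a node's state is behind its blocks.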
Would this not make the network slower, as nodes would need to be up to date in terms of state_height and block_height to be able to validate the incoming blocks?
Nodes don't need to wait for state_height to validate the incoming blocks. They have enough information in headers to validate blocks.
But if they want to validate the incoming transactions, they need to wait for state_height.
Yes you are right, I meant to say transactions.
I think some tests need to be done to see roughly what the gap between state_height and block_height will be under stressful conditions. If I am understanding correctly, it would mean that if node1 is at state_height = 200 and node2 is at state_height = 300 and I called getbalance on each node, I may get different values from node1 and node2.
Maybe getbalance or other invoke function calls could take a minimum state height as an optional parameter, which would basically return an error if the node hasn't reached that height yet.
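A rough sketch of that idea, assuming a hypothetical in-memory Node with a get_balance method and a min_state_height parameter. None of these names come from the real RPC interface; they only show the "error if not yet at that height" behaviour.

```python
from typing import Dict, Optional


class StateNotReadyError(Exception):
    """Raised when the node's state is older than the caller requires."""


class Node:
    # Hypothetical stand-in for a node; balances reflect state_height only.
    def __init__(self, state_height: int, balances: Dict[str, int]) -> None:
        self.state_height = state_height
        self.balances = balances

    def get_balance(self, address: str,
                    min_state_height: Optional[int] = None) -> int:
        # Refuse to answer from state older than the caller asked for.
        if min_state_height is not None and self.state_height < min_state_height:
            raise StateNotReadyError(
                f"state_height {self.state_height} < required {min_state_height}"
            )
        return self.balances.get(address, 0)


node1 = Node(state_height=200, balances={"addr": 10})
node2 = Node(state_height=300, balances={"addr": 25})

print(node2.get_balance("addr", min_state_height=250))
try:
    node1.get_balance("addr", min_state_height=250)
except StateNotReadyError as exc:
    print("node1 rejected the query:", exc)
```

Without the parameter, both nodes answer and may disagree, which is the situation described above.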
@f27d Yes, that could be a workaround for clients connecting to a node. I think tests would need to be run to verify that the increase in speed is justifiable and that there are no attack vectors if, for example, 40% of the nodes are behind in state_height.
A node won't accept any transaction if state_height != block_height || block_height != header_height.
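The acceptance rule above can be written as a small predicate. The function name accepts_transactions is made up for illustration; only the condition itself comes from the comment.

```python
def accepts_transactions(header_height: int, block_height: int,
                         state_height: int) -> bool:
    # Reject when state_height != block_height || block_height != header_height,
    # i.e. accept only when all three heights are equal.
    return state_height == block_height and block_height == header_height


print(accepts_transactions(300, 300, 300))  # fully synced node
print(accepts_transactions(300, 300, 200))  # state is still catching up
```

Under this rule a node that is behind in state can still validate and relay blocks, but stays out of transaction validation until it has caught up.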
Creating and maintaining a Patricia tree for storing the state of balances and putting its root hash in the block header will also solve the trust problem of the light clients. Currently, a light client has to trust the answers coming from the API server. However, if an API server is compromised, it can return incorrect answers to a targeted set of light clients. To the best of my knowledge, there is currently no means for the light clients to verify the answers coming from the API server. If the answers are accompanied by a Merkle state proof, then light clients storing only the block headers can verify the answers.
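To illustrate the verification step, here is a simplified sketch that uses a plain binary Merkle tree over SHA-256 rather than a Merkle Patricia trie. That is a deliberate simplification, and the leaf encoding, function names and duplication of the last node on odd levels are all assumptions made for the example. The point is only that a client holding the root from a header can check an answer plus proof without trusting the server.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling is on the left)


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    # Pair up nodes; the caller guarantees an even number of entries.
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: List[bytes], index: int) -> Proof:
    # What the (untrusted) server would compute and send with its answer.
    if not 0 <= index < len(leaves):
        raise IndexError("leaf index out of range")
    level = [h(leaf) for leaf in leaves]
    proof: Proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: Proof, root: bytes) -> bool:
    # What the light client runs, knowing only the root from the header.
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


state = [b"alice:10", b"bob:5", b"carol:7"]
root = merkle_root(state)          # would live in the block header
proof = merkle_proof(state, 1)     # returned alongside "bob:5"
print(verify(b"bob:5", proof, root))
print(verify(b"bob:500", proof, root))
```

A compromised server that returns a modified balance cannot produce a proof that hashes up to the root in the header.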
@erikzhang I think it is a great idea. It can also pave the way to increasing light clients' trust level, which could let different kinds of dApps work efficiently on the blockchain without using the tremendous number of transactions/queries that their clients could make. What information will you store in state_height in particular? Do you mean converting state into a Merkle Patricia tree as @muratyasin suggested?
Fellows, this NEP proposal is directly related to this issue: neo-project/proposals#75
Feel free to contribute or join us as a co-author :)
I think this is a great idea... can we keep discussing this for Neo 3.0?
@erikzhang this is a genius idea!! I've never imagined it before hahaha. I think I know how to manage these detached states, without including them in block headers, while still guaranteeing things are correct ;) I'll try to prototype the idea!
If we have this implemented, why would we need the block headers? What matters are the transactions, right? If we can persist them without having to add the header, then what is the header used for?
Or do you think that they are run, added to storage, and removed if there is a disagreement later? I understand the idea, I'm trying to see what our design options are.