RFC: Add a way to ask for a storage proof of just the hash of a value

Question

RFC: Add a way to ask for a storage proof of just the hash of a value

tomaka opened this issue 2 years ago · 8 comments

Context and motivation

In the Polkadot network, there are two kinds of nodes: full nodes, that have the storage of recent blocks, and light clients, that don't. In order for light clients to work, they need to request from full nodes the storage items that they want to access. To do so, they send a networking request.

When it receives such networking request, the full node returns a Merkle proof containing the values that have been requested.

In some situations, the light client doesn't care about the value of the storage items that it wants to access, but only their hash. What this RFC proposes is to add a way for the light client to indicate, in the networking request, that it doesn't care about the values of the storage items that it requests but only their hash. When this is the case, the full node can omit the storage values from the Merkle proof that it sends back, and thus reduce its size, sometimes considerably.

This would be helpful, for example:

In order to check whether a storage item still matches a locally-known value without downloading it. This is important for big storage items such as :code.
- In particular, when the smoldot light client connects to a chain, almost all of its initialization time and bandwidth are used to download :code from a full node. By implementing this proposal, the light client could store the value of :code on the disk and check if it is still the same as on chain instead of always downloading it.
In order to enumerate the list of keys in the storage, in order to answer the JSON-RPC requests that require you to enumerate keys.

I won't go in details as to how values can be omitted from Merkle proofs, as this is a bit complicated and out of scope here, however it is important to know that in order for values to be omitted, a chain must have transitioned to the "version 1" of the trie format by switching the state_version field in the runtime spec to 1. If it hasn't done this, it is not possible to omit storage values, as otherwise the proof would no longer be verifiable.
Chains that are still using version 0 of the trie format should still implement this proposal, but they would leave the values in the proof like is currently the case.
It is also technically possible for a chain to be in the middle of a version 0 -> version 1 trie format transition, but this isn't problematic either, as values can be left in the Merkle proof when the corresponding storage item is still at version 0.

Changes

The current Protobuf schema for the "light client requests" protocol can be found here: https://github.com/paritytech/substrate/blob/08d1b2c7846f0280aa57dc1145a2c631b12c0789/client/network/light/src/schema/light.v1.proto

This RFC suggests the following diff:

@@ -11,6 +11,7 @@ message Request {
                RemoteReadRequest remote_read_request = 2;
                RemoteReadChildRequest remote_read_child_request = 4;
                // Note: ids 3 and 5 were used in the past. It would be preferable to not re-use them.
+               RemoteReadRequestV2 remote_read_request_v2 = 6;
        }
 }
 
@@ -48,6 +49,21 @@ message RemoteReadRequest {
        repeated bytes keys = 3;
 }
 
+message RemoteReadRequestV2 {
+       required bytes block = 2;
+       optional ChildTrieInfo child_trie_info = 3;  // Read from the main trie if missing.
+       repeated Key keys = 6;
+}
+
+message ChildTrieInfo {
+       enum ChildTrieNamespace {
+               DEFAULT = 1;
+       }
+
+       required bytes hash = 1;
+       required ChildTrieNamespace namespace = 2;
+}
+
 // Remote read response.
 message RemoteReadResponse {
        // Read proof. If missing, indicates that the remote couldn't answer, for example because
@@ -65,3 +81,8 @@ message RemoteReadChildRequest {
        // Storage keys.
        repeated bytes keys = 6;
 }
+
+message Key {
+       required bytes key = 1;
+       optional bool skipValue = 2; // Defaults to `false` if missing
+}

The new message (RemoteReadRequestV2) is similar to RemoteReadRequest and RemoteReadChildRequestV2, except for the fact that the repeated bytes keys field is replaced with repeated Key keys.
The new Key struct contains the key itself (what would normally be passed in the repeated bytes keys field), but also a skipValue boolean indicating that the value of this storage item isn't important to the requester.

The skipValue field has been defined as an optional boolean, because I expect the value to be false most of the time. When the value is false, it is possible to omit the field entirely, saving 2 bytes. Because this only saves 2 bytes, I don't think that this greatly matters anyway.

The skipValue field must be considered as an optimization. It is not legal for the remote to ignore this field and send back a Merkle proof that contains the value anyway. As explained above, this is necessary in order to handle version 0 of the trie format.

An alternative proposal could have been to add a request-wide skipValue field applying to all the requested keys, instead of one skipValue field per key. It is unclear to me whether there is any real advantage to having one field per key, but having one field per key is a strictly more superior solution, so in doubt I have decided to go for this one.

Upgrade path

The upgrade path is simple: once a version of Substrate supporting this proposal is released and widely deployed, light clients can start sending requests.

Having a request-wide skipValue field instead of one per field would provide more backwards-compatibility, as older nodes would simply ignore the flag. As such, light clients could start sending requests without having wait for a version supporting this proposal to be deployed. However, sending requests with this flag when no full node supports it is kind of useless.

Given that a light client must already handle situations where a full node refuses to answer requests for no particular reason, having a situation where a full node doesn't answer requests because it is too old to support this proposal doesn't add any complexity to implementations.

Misc

See also the corresponding Substrate issue: paritytech/substrate#10623

tomaka commented 2 years ago

cc @cheme

Answer 1 · 2022-11-24T09:19:53.000Z

Proposal sounds good to me, just a small question, should we merge both request v2 into :

message RemoteReadRequestV2 {
       required bytes block = 2;
       optional bytes storage_key = 3;
       repeated Key keys = 6;
}

Answer 2 · 2022-11-24T09:25:27.000Z

Or:

message RemoteReadRequestV2 {
       required bytes block = 2;
       optional StateInfo state_info = 3;
       repeated Key keys = 6;
}
message StateInfo {
       required bytes name;
       optional StateType type; 
}
enum StateType {
       DEFAULT,
}

(probably not a correct proto syntax, but the idea is here).
This way instead of having long form of storage_key (with all the prefixing), we just use name which is the unprefixed storage (so smaller to encode and extensible).

Answer 3 · 2022-11-24T09:47:50.000Z

To be honest my understanding of child tries is still blurry enough that I don't have an opinion here.

Answer 4 · 2022-11-25T09:00:42.000Z

Note: Messages are defined in the #591 PR, but this issue is part of the larger light client spec. As discussed yesterday with Bhargav, he will cover this behavior there and also spec how any storage values are verified against the merkle root/nodes.

Answer 5 · 2022-11-29T13:59:35.000Z

Or:
message RemoteReadRequestV2 {
       required bytes block = 2;
       optional StateInfo state_info = 3;
       repeated Key keys = 6;
}
message StateInfo {
       required bytes name;
       optional StateType type; 
}
enum StateType {
       DEFAULT,
}
(probably not a correct proto syntax, but the idea is here). This way instead of having long form of storage_key (with all the prefixing), we just use name which is the unprefixed storage (so smaller to encode and extensible).

I've updated the OP to do this except that I've adjusted the naming a bit. The naming doesn't actually matter that much, as the names aren't sent on the wire.

Answer 6 · 2022-12-07T15:53:11.000Z

It is not legal for the remote to ignore this field and send back a Merkle proof that contains the value anyway.

You mean "It is legal for the remote to ignore, ...". As otherwise it would not work for state-version 0? (As you say yourself in the next sentence)

Besides this one nitpick, it looks good to me!

Answer 7 · 2023-02-21T10:21:36.000Z

Closing in favor of w3f/PPPs#5