waku-org/research

Cross-check results from multiple servers

Opened this issue · 2 comments

Without consensus, responses of different servers to the same request may differ. If the client wants to minimize the risk of missing a prior message, it may query multiple servers and merge the results.

Two sub-questions arise:

  • how to merge results (#40);
  • how to adjust the reputation of servers in light of their inconsistent responses. This is part of the reputation system, global (#48) or local (#49). The question is non-trivial: if server A reports a message that server B does not, we cannot tell whether A is inserting fake messages into the history, B is censoring a real message, or A and B are both honest but their views of history have diverged naturally.
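To make the two sub-questions concrete, here is a minimal sketch of one naive merge strategy: take the union of all responses and record which servers did not report each message. This is only an illustration, not the approach chosen in #40. The function name `merge_responses`, the `(message_id, payload)` response shape, and the server names are assumptions made for this example.

```python
from collections import defaultdict


def merge_responses(responses):
    """Merge responses from several servers by taking their union.

    `responses` maps a server name to an iterable of (message_id, payload)
    pairs. Both the shape and the notion of a message id are hypothetical.
    Returns (merged, disputed), where `disputed` maps each message id that
    was not reported by every server to the set of servers that omitted it.
    """
    merged = {}
    reporters = defaultdict(set)
    for server, messages in responses.items():
        for msg_id, payload in messages:
            # Keep the first payload seen for a given id.
            merged.setdefault(msg_id, payload)
            reporters[msg_id].add(server)

    all_servers = set(responses)
    disputed = {
        msg_id: all_servers - seen_by
        for msg_id, seen_by in reporters.items()
        if seen_by != all_servers
    }
    return merged, disputed


responses = {
    "A": [("m1", b"x"), ("m2", b"y")],
    "B": [("m1", b"x")],
}
merged, disputed = merge_responses(responses)
```

Note that `disputed` only says that B omitted `m2`. As described above, it cannot say whether A fabricated the message, B censored it, or the two views simply diverged, so it is an input to a reputation policy rather than a verdict.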

Comment by @chaitanyaprem moved here from the PR discussion:

How can the client verify if the server has honoured the request?

What if the client queries for messages from the last hour and the server responds with only 10 messages (to save resources or for some other reason), whereas 15 messages are available for the queried time range?

I am wondering whether this reputation could also be tied to the store-sync protocol, or whether it is the client's responsibility to query a secondary store node to confirm the authenticity of the response from the first. In that case, the client should not be charged for 2 queries but only for 1, and the first store node could be disincentivized.

I know this adds complexity, but it is better to have a plan for handling it.

If the store-sync protocol is provided for free by a server, it may help with this.

For example, Alice can do a store-sync query to Bob and Carole.

Alice can then compare Bob and Carole:

  • If the results are the same, select the cheapest provider.
  • If the results differ, query as many messages as possible from the cheapest provider and the remaining messages from the more expensive provider.
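The comparison above could be sketched roughly as follows. This assumes each store-sync query returns a set of message ids and that each provider has a single known price. The function `plan_fetch`, the price values, and the id format are hypothetical and not part of any Waku API.

```python
def plan_fetch(ids_by_server, price_by_server):
    """Assign each message id to the cheapest server that advertises it.

    `ids_by_server` maps a server name to the set of message ids it
    returned in its store-sync response (hypothetical shape).
    `price_by_server` maps a server name to its price (hypothetical).
    Returns a dict mapping server name to the set of ids to request from it.
    """
    plan = {}
    assigned = set()
    # Visit servers from cheapest to most expensive.
    for server in sorted(ids_by_server, key=lambda s: price_by_server[s]):
        new_ids = set(ids_by_server[server]) - assigned
        if new_ids:
            plan[server] = new_ids
            assigned |= new_ids
    return plan


ids_by_server = {
    "bob": {"m1", "m2", "m3"},
    "carole": {"m2", "m3", "m4"},
}
price_by_server = {"bob": 2, "carole": 1}
plan = plan_fetch(ids_by_server, price_by_server)
```

Here Carole is cheaper, so she is asked for everything she advertises, and Bob is asked only for `m1`. When both servers advertise the same ids, the plan contains only the cheapest one.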

The store-sync query acts as a form of commitment from Bob and Carole that they are storing specific messages.
This shifts the problem to one of consistency between the store-sync query ("here are the ids of the messages I have available") and the store query ("I am providing the payloads of the messages I said I had"), which might be simpler to handle.
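A rough sketch of such a consistency check is shown below. It assumes, purely for illustration, that a message id is the SHA-256 digest of the payload. The real identifier scheme may well differ, and the helper names `message_id` and `check_delivery` are hypothetical.

```python
import hashlib


def message_id(payload: bytes) -> str:
    # Stand-in id scheme (assumption): SHA-256 of the raw payload.
    return hashlib.sha256(payload).hexdigest()


def check_delivery(committed_ids, payloads):
    """Compare a store response against an earlier store-sync commitment.

    Returns (missing, unexpected):
      missing    - ids the server committed to but did not deliver;
      unexpected - ids of delivered payloads that were never committed.
    """
    committed = set(committed_ids)
    delivered = {message_id(p) for p in payloads}
    return committed - delivered, delivered - committed


p1, p2 = b"hello", b"world"
committed = {message_id(p1), message_id(p2)}

# The server delivers only one of the two messages it committed to.
missing, unexpected = check_delivery(committed, [p1])
```

A non-empty `missing` set means the server broke its own commitment, which is attributable to that server alone and does not require comparing it against another server.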