filecoin-project/go-f3

Use delayed power table to validate or drop messages from future instances

anorth opened this issue · 8 comments

We have agreed that voting weights for an instance should come not from the power table resulting from the immediately previous instance, but from one some distance further back (10 instances / 5 minutes? longer?). This allows us to do proper message validation and avoid retransmitting bad messages.

Since the power table is given by the node, it might be that there's little or no code change required in GPBFT, but (1) confirm this, (2) ensure that multi-instance tests do this properly (#295), and (3) let's use #125 to help us gain confidence in the lookback distance.

The point of this is to be able to validate messages from near-future instances (from a node's point of view). So there will be work here to implement that validation and a queue of validated messages. This may involve changes to the host API so the participant can reliably keep track of N power tables and have the right one on hand for any message.
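A rough sketch of what such a queue could look like, assuming the participant can look up the power table to use for any near-future instance. Every name here (`Message`, `futureQueue`, `maxAhead`, the `tableLookup` callback) is hypothetical and not part of go-f3; real validation would also check signatures and message contents.

```go
package main

// Message is a simplified stand-in for a GPBFT message.
type Message struct {
	Instance uint64
	Sender   uint64
}

// tableLookup returns the power (by participant ID) to use for an instance,
// or false if no table is known for it yet. Hypothetical callback.
type tableLookup func(instance uint64) (map[uint64]int64, bool)

// futureQueue holds validated messages for instances up to maxAhead
// beyond the current one.
type futureQueue struct {
	current  uint64
	maxAhead uint64
	lookup   tableLookup
	pending  map[uint64][]Message
}

func newFutureQueue(current, maxAhead uint64, lookup tableLookup) *futureQueue {
	return &futureQueue{
		current:  current,
		maxAhead: maxAhead,
		lookup:   lookup,
		pending:  make(map[uint64][]Message),
	}
}

// Receive validates a message against the power table for its instance and
// queues it. It returns false when the message is dropped.
func (q *futureQueue) Receive(m Message) bool {
	if m.Instance < q.current || m.Instance-q.current > q.maxAhead {
		return false // stale, or too far in the future to validate
	}
	table, ok := q.lookup(m.Instance)
	if !ok {
		return false // no power table on hand for that instance
	}
	if table[m.Sender] <= 0 {
		return false // sender has no power in that instance
	}
	q.pending[m.Instance] = append(q.pending[m.Instance], m)
	return true
}

// Advance moves to the next instance, discards anything older, and returns
// the already-validated messages queued for it.
func (q *futureQueue) Advance(next uint64) []Message {
	q.current = next
	for inst := range q.pending {
		if inst < next {
			delete(q.pending, inst)
		}
	}
	msgs := q.pending[next]
	delete(q.pending, next)
	return msgs
}
```

Messages that fail validation are dropped on receipt and never rebroadcast, which is the retransmission saving described above.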

I'm closing #130 and expanding the scope here a little with a task list including maintaining the delayed power tables and using them to validate messages.

The host (Lotus) must end up with a store of (instance, finalised tipset) records somewhere in order to be able to bootstrap the protocol when a node starts up. F3 couldn't otherwise know what instance it's up to or what power table to use. So, the API will assume that F3 can ask the host for the power table corresponding(*) to an instance. The host can map instance -> Tipset -> Epoch and then go build the power table that F3 needs. F3 can cache the results.
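As an illustration of "F3 asks the host, then caches the result", here is a minimal sketch. The interface name `PowerTableSource`, its method, and the simplified `PowerTable` type are assumptions for this example, not the actual go-f3 host API.

```go
package main

import "fmt"

// PowerEntry and PowerTable are simplified stand-ins for the real types.
type PowerEntry struct {
	ID    uint64
	Power int64
}

type PowerTable []PowerEntry

// PowerTableSource is a hypothetical host-side interface: given an instance,
// the host maps instance -> tipset -> epoch and builds the power table.
type PowerTableSource interface {
	PowerTableForInstance(instance uint64) (PowerTable, error)
}

// cachingSource wraps a host and memoises the tables it returns.
type cachingSource struct {
	host  PowerTableSource
	cache map[uint64]PowerTable
}

func newCachingSource(host PowerTableSource) *cachingSource {
	return &cachingSource{host: host, cache: make(map[uint64]PowerTable)}
}

func (c *cachingSource) PowerTableForInstance(instance uint64) (PowerTable, error) {
	if pt, ok := c.cache[instance]; ok {
		return pt, nil
	}
	pt, err := c.host.PowerTableForInstance(instance)
	if err != nil {
		return nil, fmt.Errorf("fetching power table for instance %d: %w", instance, err)
	}
	c.cache[instance] = pt
	return pt, nil
}

// countingHost is a toy host that records how often it is asked.
type countingHost struct{ calls int }

func (h *countingHost) PowerTableForInstance(instance uint64) (PowerTable, error) {
	h.calls++
	return PowerTable{{ID: 1, Power: int64(instance) + 1}}, nil
}
```

A real cache would need eviction once instances are far enough in the past; that is left out here.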

A significant design question is whether the lookback parameter should be internal to F3, or encapsulated by the host. I.e. does "corresponding to" (*) mean the power table finalised by an instance, or the power table to be used for an instance? It initially seems natural to make the parameter internal to F3, but a few things push back against that:

  • All the sim testing infrastructure is set up to associate the power table to be used for an instance. That also makes tests much easier to write and independent of that parameter value. So the sim would have to compute the reverse offset to feed F3 the power table from an instance (which would initially be a bunch of genesis tips). The associations in the test code would be different to the production code, which would be confusing.
  • If the API is "finalised by", then F3 also needs a way to ask for the genesis power table. With instance numbers as uint64 there's a very high risk of an underflow when computing current - offset to find the right instance. We could add an explicit API for fetching genesis, but then we need underflow checks and branches anywhere that calls these methods. We could alternatively make instance numbers a signed int64, and then infer genesis from any negative instance. (I would choose this option, using int64 throughout.)
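To make the underflow concern concrete, here is a small sketch comparing the three options. The function names are illustrative only.

```go
package main

// lookbackUnsigned subtracts naively. When offset > current the result wraps
// around to a huge instance number instead of signalling genesis.
func lookbackUnsigned(current, offset uint64) uint64 {
	return current - offset
}

// lookbackChecked is the "explicit genesis" variant: callers must branch on
// the boolean everywhere they compute a lookback.
func lookbackChecked(current, offset uint64) (instance uint64, ok bool) {
	if current < offset {
		return 0, false // caller must fetch the genesis power table instead
	}
	return current - offset, true
}

// lookbackSigned uses int64 instances; any negative result means genesis.
func lookbackSigned(current, offset int64) int64 {
	return current - offset
}
```

With current = 3 and offset = 10, the unsigned version silently yields 2^64 - 7, the checked version forces a branch at each call site, and the signed version yields -7, which can be read as "use genesis".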

Thus, I am first going to encapsulate the offset parameter in the host. F3 will ask for the power table it should use for an instance. This means the offset configuration lives in the host, and subtraction of the offset happens on the host side.
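A minimal sketch of what "the offset lives in the host" could look like. The `hostStore` type, its fields, and the use of strings as stand-ins for power tables are all assumptions for illustration; a real host would go instance -> tipset -> epoch -> power table.

```go
package main

import "fmt"

// hostStore is a hypothetical host-side store. Power tables are represented
// by string labels to keep the example small.
type hostStore struct {
	lookback  uint64            // offset configuration, owned by the host
	genesis   string            // power table used before enough instances exist
	finalised map[uint64]string // instance -> power table resulting from it
}

// PowerTableToUse answers "which power table should F3 use for this
// instance?". F3 never sees the lookback; the subtraction happens here.
func (h *hostStore) PowerTableToUse(instance uint64) (string, error) {
	if instance < h.lookback {
		return h.genesis, nil // not enough history yet, fall back to genesis
	}
	source := instance - h.lookback
	pt, ok := h.finalised[source]
	if !ok {
		return "", fmt.Errorf("no finalised record for instance %d (needed by instance %d)", source, instance)
	}
	return pt, nil
}
```

Because the underflow check sits in one place on the host side, F3 can keep uint64 instance numbers and the sim can keep associating each instance directly with the table it should use.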

It's not perfect, but I think we'd introduce a bunch of unnecessary complexity to try to keep the parameter in F3: reworking the simulation testing setup to account for offsets, adjusting tests that use power table fluctuations, converting instance to int64 everywhere. We can always come back to do this later if we don't like it.

IMO, we should use int64 for instances regardless (I've seen too many issues with MaxUint64 overflowing int64).

That aside, I don't think having the lookback inside go-f3 will actually be all that difficult:

  1. We can still initialize instances with the desired power table (no lookback).
  2. A GetPowerTableFromInstance method can simply assert that the passed instance has the correct lookback for the instance being simulated, then return the power table.
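The two points above might be sketched as follows for the simulation host. This assumes int64 instance numbers (so that a negative value denotes genesis) and uses string labels in place of real power tables; `simHost` and `simInstance` are hypothetical names, and only `GetPowerTableFromInstance` comes from the suggestion itself.

```go
package main

import "fmt"

// simInstance is initialised directly with the power table it should use,
// exactly as the sim does today (no lookback involved).
type simInstance struct {
	number int64
	table  string
}

// simHost serves the instance currently being simulated.
type simHost struct {
	lookback int64
	current  simInstance
}

// GetPowerTableFromInstance checks that go-f3 asked for the instance that is
// exactly `lookback` behind the simulated one, then returns the table the
// simulated instance was initialised with.
func (s *simHost) GetPowerTableFromInstance(from int64) (string, error) {
	if want := s.current.number - s.lookback; from != want {
		return "", fmt.Errorf("asked for instance %d, expected %d (current %d, lookback %d)",
			from, want, s.current.number, s.lookback)
	}
	return s.current.table, nil
}
```

The assertion keeps tests independent of the lookback value while still catching a caller that applies the wrong offset.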

Regarding Lotus having to store the power table, we are storing finality certificates, which should contain the power tables that are being finalized.

Well, the finality certificates only store power table diffs. But looking up the power table associated with an instance isn't difficult (instance - lookback_distance -> head ts -> power table).

See https://github.com/filecoin-project/go-f3/pull/273/files#r1610685278 for an example of the power tables I'll need if we implement #257.

Testing the effectiveness of this has been broken out to #295.