filecoin-project/lotus

Remove 20 tipset walkback for nv14+ drand beacon look-up

rvagg opened this issue · 3 comments

This is something that came out of doing some refactoring in #12428 but I've opted to leave it alone there so we can deal with this separately. I wanted to keep the look-up logic intact in #12428 because that's not what I'm trying to address there.

As per https://github.com/filecoin-project/lotus/pull/12428/files#r1750127976 and discussion in https://github.com/filecoin-project/lotus/pull/12428/files#r1750118970, for nv14+ we do precise drand round matching when looking up beacon entries on chain. We expect to always have the matching rounds that correspond to filecoin epochs, even for quicknet.

For filecoin null rounds, we don't capture beacons but we catch-up in the next non-null round, so we end up having all of the drand rounds, even for null rounds, so any filecoin epoch post-nv14 will have a drand round on chain somewhere (it might be an interesting exercise to validate this, it shouldn't be too difficult). So if we request randomness for a filecoin epoch that was null, we just go to the next non-null tipset and fish around in its list of beacons to find the one with the matching drand round that we want.

Next non-null tipset is fetched here:

lotus/chain/rand/rand.go

Lines 223 to 224 in a83d02e

randTs, err := sr.GetBeaconRandomnessTipset(ctx, filecoinEpoch, false)
if err != nil {

But then if we don't find the round we want, we walk back 20 tipsets in search of it:

lotus/chain/rand/rand.go

Lines 232 to 239 in a83d02e

for i := 0; i < 20; i++ {
cbe := randTs.Blocks()[0].BeaconEntries
for _, v := range cbe {
if v.Round == round {
return &v, nil
}
}

I believe this logical was accidentally copied from GetLatestBeaconEntry when introduced in #7376, where the 20 tipset walkback was looking for a tipset with at least one beacon entry.

  1. The walkback should never result in matching the precise round we want, so it'll always exhaust the 20 and error so we should just return an error if we can't find the round in the tipset we asked for
  2. Consider doing something similar in GetLatestBeaconEntry - is there ever a case in nv14+ that we have a tipset that doesn't have any beacon entries in it? It was originally introduced very early, April 2020, 8e0ca30 and may have made sense back then but for nv14+ I don't think it does.

If someone wants to take this on and wants to learn a bit about this - I recommend writing some code that iterates from head all the way back to nv14 (1231620 I think) and does look-up of drand round for filecoin epoch (beacon.BeaconForEpoch(filecoinEpoch).MaxBeaconRoundForEpoch(nv, filecoinEpoch)) and then finds that round in the expected epoch - either that specific epoch, or the next non-null one. It would be really interesting to verify that we have every single MaxBeaconRoundForEpoch for every single filecoin epoch, including null rounds.

You don't need a full archival node for this, just syncing from the latest snapshot should be fine because it'll include the entire tipset backbone history that you can walk.

Can I work on implementing this @rvagg?

  1. Remove the 20-tipset walkback logic for nv14+.
  2. Return an error immediately if the specific Drand round isn't found in the initial tipset.
  3. This change assumes that for nv14+, we always have the matching Drand rounds for any Filecoin epoch, including null rounds.

Proposed changes to extractBeaconEntryForEpoch:

func (sr *stateRand) extractBeaconEntryForEpoch(ctx context.Context, filecoinEpoch abi.ChainEpoch) (*types.BeaconEntry, error) {
	randTs, err := sr.GetBeaconRandomnessTipset(ctx, filecoinEpoch, false)
	if err != nil {
		return nil, err
	}
	nv := sr.networkVersionGetter(ctx, filecoinEpoch)
	round := sr.beacon.BeaconForEpoch(filecoinEpoch).MaxBeaconRoundForEpoch(nv, filecoinEpoch)
	if nv >= network.Version14 {
		// For nv14+, we expect the matching round to be in this tipset
		cbe := randTs.Blocks()[0].BeaconEntries
		for _, v := range cbe {
			if v.Round == round {
				return &v, nil
			}
		}
		return nil, xerrors.Errorf("beacon entry for round %d (epoch %d) not found in expected tipset", round, filecoinEpoch)
	} else {
		// Pre-nv14 logic
		// ...
	}
}

Kuba pointed out a logical reason for the 20 tipset walkback @ #12428 (comment) - devnets where we set a faster blocktime and still want to be able to interact with beacons. When running a blocktime faster than drand's we're going to have epochs with an empty beacon entries list, so in that case we go backward to find the one we want.

I've put documentation into the 2 places it matters in #12428 so future 👀 don't get as confused as me.

But thanks for taking a look and considering it @virajbhartiya, sorry to be the bearer of WONTFIX.