anoma/ferveo

Fast subgroup checks

joebebel opened this issue · 8 comments

The protocol extensively sends/receives points on the BLS12-381 curve from third parties, there may be many subgroup checks needed to ensure that all such points lie in the prime order subgroup.

This task involves two parts:

  1. Identify every place in the protocol where such subgroup checks are necessary
  2. Implement a fast subgroup check algorithm, e.g. https://eprint.iacr.org/2019/814.pdf

Apparently Celo (https://github.com/celo-org/celo-blockchain/tree/master/crypto/bls12381) gets a big speed improvement from batching subgroup checks, its worth considering combining batching with the fast algorithm described in Sean Bowe's paper.

Fast subgroup check from Bowe's eprint/2019/814 is more efficient than multiplying by the cofactor only on the G2 case (the G1 cofactor is small).

The G2 subgroup check is (partially) done in zkcrypto/bls12_381: the clear_cofactor function uses the Bowe's trick but is not used in the is_torsion_free function.

The arkworks-rs/curves implementation does not provide a is_torsion_free function.

I forked the zkcrypto/bis12_381 into heliaxdev/bls12_381 and implemented (as an exercise) the G1 subgroup check as in Bowe's paper.

I think what's happening in the G1 case is that the implementation of multiply in bls12_381 is constant-time, therefore $[(z^2-1)/3] P$ costs exactly the same as $[q] P$ which would make the fast subgroup check actually slower. In order to actually take advantage of the fast test, the final multiply needs to only work on [u8; 16] instead of [u8; 32] otherwise it will continue to double the base point

Actually the celo implementation I linked to is written in Go, and while it uses Bowe's method, it doesn't do the batching.

The batching is instead implemented in zexe (celo-org/zexe#4) and it seems like the performance speedup is substantial. So now we have yet another dependency issue to deal with, as zexe seemes like an arkworks fork?

(as exercies) I have done the is_torsion_free_optimized functions for G1 and G2 using the Bowe's trick and the gain is significant as expected. See heliaxdev/bls12_381 commit de80c8ab4cd2ceb2b7b9026f2571546695eaeb26 and 6a8eb9f9c534bf035a407d17081fa2e313bb0e1d for details.

Probably going to be integrated into arkworks anyway, so nothing probably required from our end right now.

#58 (comment) provide benchmarks of the fast subgroup check for G1 and G2.