Possible improvements in subgroup testing
arielgabizon opened this issue · 7 comments
A significant part of Sapling transaction verification cost relates to checking certain elements are in certain subgroups.
We show that for SNARK proof elements, when using (at least the current implementation of) the Ate pairing, G1 checks are unnecessary.
Let F = Fqk be the target field of the pairing.
We will use
Lemma: In the Ate pairing, for any a ∈ G1, b ∈ G2, p ∈ E(Fq): e(a + [r] p, b) = e(a, b) where r is G1's order.
Proof:
This is because the Ate pairing is constructed from the Tate pairing with a switch of the arguments.
So it is in fact a map e: E(F)/[r]E(F) × E(F)[r]. I.e. the first argument doesn't have to be of prime order,
and the pairing result doesn't change when adding to the first argument any r'th multiple.
See a code test of this here: https://github.com/arielgabizon/pairing/tree/pairing-outside-subgroup
We can use this fact to show that given any a ∈ E(Fq) we can efficiently move to a' in G1 that is equivalent to a in terms of the pairing.
Specifically,
Claim: Let c be the co-factor of G1 in E(Fq), and let d be c's inverse in Fr.
Then for any a ∈ E(Fq) and b ∈ G2
e([d] [c] a, b) = e(a, b).
Proof: It can be seen that we can write a = A + [r] B for some A ∈ G1 and B ∈ E(Fq).
Then
e([d] [c] a, b) = e([c] a, b)d = e([c] A + [r] [c] B, b)d
Using the Lemma this is
= e([c] A, b)d
= e([d] [c] A, b)
= e(A, b)
Using the Lemma again this is
= e(a, b)
Another possible optimization is to batch subgroup checks. The most natural way of doing this - taking
a combination of all elements to be tested with random coefficients, does not work well when the subgroup is of small co-factor (or even of co-factor with small factors :) ) inside a larger cyclic group.
We propose to do it differently, based on the following claim:
Claim:
Suppose H is a subgroup of G, and a1, .., at are elements of G; of which at least one element is not in H.
Then for at least 1/2 of the subsets T of {1, .., t}
sumi ∈ T ai is not in H.
Proof:
Assume w.l.g that a1 is not in H.
We can think of the subsets of {1, .., t} as a disjoint union of the pairs {T, T ∪ {1}}.
Where T goes over all subsets of {2, .., t}.
It is easy to see that for each such pair, at most one of the corresponding sums is in H.
The claim suggests the following test: If the desired error is 2-s. Choose s random subsets T
and for each one check if
sumi ∈ T ai is in H.
Reject if any of the s checks failed.
You could take random coefficients coprime to the cofactor, no? Are there so many divisors of these cofactors so that avoiding them all gets too slow?
@burdges so we considered that but turned out even with co-prime coefficients the cheating probability is as large as 1/(smallest factor in cofactor)
In general, whether a pairing algorithm needs its inputs to be in the subgroup will depend on the implementation of the Miller loop. It's entirely possible that the formulae for "doubling with line evaluation" and "addition with line evaluation" may only operate correctly in the subgroup.
There is lower-hanging fruit than this when it comes to optimizing validation; for example, we should first implement batch verification of proofs and signatures as described in Appendix B of the spec.
This isn't a candidate for Blossom because there is no change we can make to the definition of what is a valid proof that has been sufficiently well analysed and that helps to optimize the subgroup check. (If there are optimizations that can be done without changing the definition of a valid proof, they could be done at any time, not just at an upgrade.) This might have changed by NU3, though.
@ebfull has very recently published the paper Faster Subgroup Checks for BLS12-381.
For discussion of using the technique in that paper for recursive validation in R1CS, see #3425 (comment) . Here I'll focus on how to compute the scalar multiplication by (z2 - 1)/3 used in the G1 subgroup check.
We need to find an efficient addition-subtraction chain for (z2 - 1)/3, which in binary representation is 111001011011001000110000000000010101010101010111100001010101100000000000000000000000000000000001010101010101010101010101010101. The attached Python program expresses and checks an addition chain (it doesn't use subtraction) for this scalar, which uses 125 doubles and 18 adds. The only mildly tricky part is to compute [1010101] P in order to handle the runs of alternating 1s and 0s efficiently.
I don't claim that this is an optimal chain, but it must be within a few operations of optimal. I considered using the decomposition (|z| - 1).((|z| + 1)/3 but it gave no improvement.
(Sorry if it isn't self-explanatory what chain this represents. I'll clean it up if I have time.)
Orchard will use prime-order curves so will not have this issue. Leaving it open in case we still want to optimize subgroup testing for BLS12-381. In any case this comment still applies:
There is lower-hanging fruit than this when it comes to optimizing validation; for example, we should first implement batch verification of proofs and signatures as described in Appendix B of the spec.