herumi/bls-eth-go-binary

Can we change the constant number for G2-cofactor?

herumi opened this issue · 25 comments

eth2.0-specs requires

G2_cofactor = 305502333931268344200999753193121504214466019254188142667664032982267604182971884026507427359259977847832272839041616661285803823378372096355777062779109

But multiplication by G2_cofactor is very costfull.
mcl supports faster algorithm.
The code is mulByCofactorBLS12.

The benchmark on Core i7-8700 3.2GHz is as the following:

pairing          1.882Mclk
mapToG2 by org-cofactor   1.600Mclk ; almost one pairing!
mapToG2 by fast-cofactor 451.107Kclk

To use fast-cofactor, call mclBn_setOriginalG2cofactor(0) once after initialization.
But there is a problem.

fast-cofactor = org-cofactor * X, where X is a fixed constant.

So The value of the map-to-function changes.
Is it acceptable?

If we could not change the parameter, then I'll try to find the other solution.

I think this would be an acceptable trade off, at least in Prysm MapToG2 is a bottleneck with the amount of serialized signatures we process. As long it maps to the same g2 point, this would be a great improvement. When you say

So The value of the map-to-function changes.

Which function are you referring to ?

As cofactorG2 changes, the value of MapToPoint also changes.

That wouldn't work then, we would have to look at another solution

I see. I'll try the other way of using fast-cofactor without changing values.

That wouldn't work then

What do you mean? What won't do well? The mathematical property of two cofactors is equivalent to each other.

By the way, how can I exec bls_benchmark_test.go with bazel?
I'm sorry that I'm not good at bazel.

That wouldn't work then

What do you mean? What won't do well? The mathematical property of two cofactors is equivalent to each other.

If the value of MapToG2 changes from using a different co-factor, this would lead to a different G2 point, if I am understanding the above proposal correctly. Wouldnt this break compability with the eth2 spec ?

By the way, how can I exec bls_benchmark_test.go with bazel?
I'm sorry that I'm not good at bazel.

you can do to run the whole package test

bazel test //shared/bls/... 

if you want to run specific tests

bazel test //shared/bls/... --test_filter=NameOfTest

this would lead to a different G2 point, if I am understanding the above proposal correctly. Wouldnt this break compability with the eth2 spec ?

I see. I understand that the eth2 spec can never be changed.

you can do to run the whole package test

bazel test //shared/bls/...

Yes, I do it, but it outputs only PASSED information.
I want to know the result of benchmark. Where should I see?

I see. I understand that the eth2 spec can never be changed.

We can actually bring this up in the specs repo over here https://github.com/ethereum/eth2.0-specs/issues . The BLS spec for eth2 is not final, so we can still suggest optimizations to the spec. If
using a faster co-factor leads to large performance gains I think it would be worth it to bring it up for discussion from the research team.

Yes, I do it, but it outputs only PASSED information.
I want to know the result of benchmark. Where should I see?

Ah k, sorry there was a bug in our benchmark tests , but I pushed up changes to fix it here prysmaticlabs/prysm@189a5f1 .

You can run the benchmarks with this

bazel run //shared/bls:go_benchmark_test -- -test.bench=.

You should get the benchmark results outputted.

there was a bug in our benchmark tests, but I pushed up changes to fix it

Thank you, I could.

I think it would be worth it to bring it up for discussion from the research team.

I see. I'm trying to get performance without modifying the spec. If it would not do well, then could you discuss it?

The version using faster mul-g2-cofactor seems to run well.
herumi/bls@71264c7

// before
goos: linux
goarch: amd64
BenchmarkPairing-112                                1299            777189 ns/op               0 B/op          0 allocs/op
BenchmarkSignature_Verify-112                        595           1838702 ns/op             672 B/op         10 allocs/op
BenchmarkSignature_VerifyAggregate-112               673           1789865 ns/op            2824 B/op         16 allocs/op
PASS

// after
goarch: amd64
BenchmarkPairing-112                                1526            779828 ns/op               0 B/op          0 allocs/op
BenchmarkSignature_Verify-112                        745           1418563 ns/op             672 B/op         10 allocs/op
BenchmarkSignature_VerifyAggregate-112               868           1380366 ns/op            2824 B/op         16 allocs/op
PASS

I have a question.
MapToG2 function is used for the spec test. I can't chose the faster algorithm with keeping the value. bls.Sign() and bls.Verify() avoid calling the function directly at the above optimization. I hope that MapToG2 is not called in your application. Is it okay?

hey @herumi the MapToG2 isn't called directly in any of our code. We just use it since the spectests require it for the msg_hash_compressed_test. We dont need to keep the old co-factor in that case, since all the other tests still pass. I can just skip this test since it isnt critical in terms of maintaining compatibility between different clients.

However for the benchmarks I am not getting your after number. This is what I get

Before
goos: linux
goarch: amd64
BenchmarkPairing-8                          1226           1009807 ns/op               0 B/op          0 allocs/op
BenchmarkSignature_Verify-8                  616           1967879 ns/op             672 B/op         10 allocs/op
BenchmarkSignature_VerifyAggregate-8         584           1995572 ns/op            2824 B/op         16 allocs/op

After
goos: linux
goarch: amd64
BenchmarkPairing-8                          1195           1058972 ns/op               0 B/op          0 allocs/op
BenchmarkSignature_Verify-8                  608           1955010 ns/op             672 B/op         10 allocs/op
BenchmarkSignature_VerifyAggregate-8         600           1987850 ns/op            2824 B/op         16 allocs/op

It seems like there is very minor/no difference.

This is what I am doing, does it seem right to you ?

// Init --
// call this function before calling all the other operations
// this function is not thread safe
func Init(curve int) error {
	err := C.blsInit(C.int(curve), C.MCLBN_COMPILED_TIME_VAR)
	if err != 0 {
		return fmt.Errorf("ERR Init curve=%d", curve)
	}
	_ = C.mclBn_setOriginalG2cofactor(0)
	return nil
}

We dont need to keep the old co-factor in that case, since all the other tests still pass. I can just skip this test since it isnt critical in terms of maintaining compatibility between different clients.

I see.

However for the benchmarks I am not getting your after number. This is what I get

I've appended some benchmarks to bls_test.go.

Could you test it for the current binary and a8fdf33 branch?
I got the following results on Core i7-7700.

// the current
bls-eth-go-binary/bls% go test --bench .
goos: linux
goarch: amd64
pkg: github.com/herumi/bls-eth-go-binary/bls
BenchmarkPairing-4                          2000            720058 ns/op
BenchmarkSignHash-4                         5000            367409 ns/op
BenchmarkVerifyHash-4                       2000           1186123 ns/op
BenchmarkVerifyAggreageteHash-4               50          29594982 ns/op

// copy the file
cp bls_test.go a_test.go
// move to old branch
git co a8fdf33
// remove old file
rm bls_test.go
go test --bench .

bls% go test --bench .
BenchmarkPairing-4                          2000            716324 ns/op
BenchmarkSignHash-4                         2000            778160 ns/op
BenchmarkVerifyHash-4                       1000           1562569 ns/op
BenchmarkVerifyAggreageteHash-4               30          49779125 ns/op

Old Branch

goos: linux
goarch: amd64
pkg: github.com/nisdas/bls-go-binary/bls
BenchmarkPairing-8                	    1000	   1164712 ns/op
BenchmarkSignHash-8               	    2000	   1121390 ns/op
BenchmarkVerifyHash-8             	     500	   2472488 ns/op
BenchmarkVerifyAggreageteHash-8   	      20	  72687441 ns/op

Current Branch

goos: linux
goarch: amd64
pkg: github.com/nisdas/bls-go-binary/bls
BenchmarkPairing-8                	    2000	   1167275 ns/op
BenchmarkSignHash-8               	    3000	    610272 ns/op
BenchmarkVerifyHash-8             	    1000	   1965245 ns/op
BenchmarkVerifyAggreageteHash-8   	      30	  45725984 ns/op
PASS

Ok now the number seems better. These are the benchmarks in prysm

Before

goos: linux
goarch: amd64
BenchmarkPairing-8                          1238           1119289 ns/op               0 B/op          0 allocs/op
BenchmarkSignature_Verify-8                  488           2518015 ns/op             672 B/op         10 allocs/op
BenchmarkSignature_VerifyAggregate-8         438           2708306 ns/op            2824 B/op         16 allocs/op

After

goos: linux
goarch: amd64
BenchmarkPairing-8                          1194           1080338 ns/op               0 B/op          0 allocs/op
BenchmarkSignature_Verify-8                  679           1944597 ns/op             672 B/op         10 allocs/op
BenchmarkSignature_VerifyAggregate-8         650           1861219 ns/op            2824 B/op         16 allocs/op

It seems much better now, I might have made a mistake previously when running the benchmarks it seems like, sorry for that.

the specs on my machine are
Intel i7-8550U with 8 cores at 1.80GHz each,16gb of RAM

hey @herumi opened up this issue in the eth2 repo
ethereum/consensus-specs#1450

Thank you for the benchmark. It looks good, and it means that changing the G2 cofactor is not necessary. I'll explain it to the issue later.

I added Linux/macOS/Windows binary.

Thanks ! This is a great improvement from before. I will try to get the pr merged in prysm soon

@herumi, would it be difficult to get the static library for arm64 too ?

Is it the same one in the https://github.com/herumi/bls-go-binary? Then, I can append it.

Yeap, should be the same one

I pushed android/arm64-v8a/libbls384_256.a .

ok great, thanks 👍