ECDSA for secp256k1

Question

ECDSA for secp256k1

hannesm opened this issue 2 years ago · 8 comments

Hey,

first of all thanks for your great project. I use the generated P256r1, P224r1, P384r1 and P521r1 with additional C code (point_double and point_add from BoringSSL, plus the inversion template from this repository) to have ECDSA and ECDH for these curves. This is great.

Three questions:

Are the point operations (https://github.com/mirage/mirage-crypto/blob/v0.10.7/ec/native/point_operations.h) in scope for the Coq formalization, and could they be generated (e.g. by word_by_word_montgomery point_operations?
I know that the underlying primitives for secp256k1 etc. can be generated, but is there code (either here or elsewhere that you're aware of) to use the generated primitives with full ECDSA? I tried the point_operations above, but that doesn't pass any tests.
Same as 1 for the 25519 code, I needed to integrate some (unproven) C code from BoringSSL to complete Ed25519 and X25519 -- any plans to have these bits extracted?

Best, and thanks again for your impressive work,

Hannes

Answer 1 · 2022-10-18T18:04:01.000Z

Hi,

We aspire to provide both point operations and entire signature algorithms with integrated proofs, but we're not there yet, and future is uncertain. The main technical challenge is the state of our tooling for translating functional code for higher-level operations into memory-manipulating C-like code. The solution we used for field arithmetic inlined functions and unrolled all loops, which is not sustainable. Instead, we've experimented with using rupicola and created a proof-of-concept implementation of X25519 for 32-bit platforms here. Doing the same for other functionality seems feasible, but a bit daunting until technical debt uncovered during the latest experiment is dealt with: rupicola's duplication between function specifications and compilation lemmas is a big one, and our own scaffolding connecting field-operation-generation to rupicola could use a hindsight-informed rewrite as well. This is delicate work that'd likely require dedicated attention of myself, Jade, or Clement, or perhaps all of us, so I'm guessing it won't happen in 2022.

Regardless, I imagine a 64-bit port of the current X25519 implementation would be feasible and not that bad even without the cleanup. Ed25519 ScalarMultBase is currently in the works but going slowly due to the same issues. We haven't looked into signature verification (double-scalar multiplication) or ECDSA yet, but I see the appeal.

As for secp256k1 specifically, I believe the formulas you linked don't work because they assume a=-3. There are no a=0 formulas in fiat-crypto right now, but there are generic ones and I read ecckiila readme discusses secp256k1 and fiat-crypto; alternatively you could find suitable formulas from libsecp256k1 or the EFD. (We're planning on switching to libsecp256k1-style field arithmetic for that prime anyway). I realize having separately verified field arithmetic and point formulas is less compelling than integrated verification, but the montgomery-form field implementation we generate right now doesn't have delicate requirements on how it should be used so I'd expect it to just work when plugged into appropriate formulas and the usual ECDSA code.

I hope this helps. Please let us know how it goes; we're excited to see more of our stuff getting into MirageOS. And in case you feel up to wrestling some Coq, the a=0 formulas or 64-bit X25519 integration would be very welcome as a contribution to fiat-crypto, and probably work out quite similarly to the existing a=-3 and 32-bit cases.

Best,
Andres

Answer 2 · 2022-10-18T20:35:13.000Z

Thanks Andres, I'll look into your pointers. :) (though, due to earlier experience I may not do any Coq proofs this year, neither next year).

A fourth question I'm curious about is the Ed448 -- any chance you know of a location where to borrow the remaining code (I suspect similar to 25519) for Ed448/X448? [plus last time I checked this curve is 64 bit only with fiat atm?]

Answer 3 · 2022-10-18T23:11:12.000Z

I'm not sure whether there are fiat-crypto users who use Ed448 but if I had to implement it I'd look in https://sourceforge.net/p/ed448goldilocks/code/ci/master/tree/src/per_curve/ . I think https://github.com/mit-plv/fiat-crypto/blob/0f61b56dda31384cbeb76f14b03b522d21e27b26/fiat-c/src/p448_solinas_32.c is a 32-bit fiat-crypto output for this curve.

Answer 4 · 2022-11-01T13:02:12.000Z

We've been working on a tool-chain for Curve operations using bedrock:
https://github.com/AU-COBRA/AUCurves

Answer 5 · 2023-06-07T14:08:50.000Z

We're planning on switching to libsecp256k1-style field arithmetic for that prime anyway

May I ask why? The current implementations produce more than adequate performance (in line with other implementations), and the lack of sharp edges is kind of nice.

I realize having separately verified field arithmetic and point formulas is less compelling than integrated verification, but the montgomery-form field implementation we generate right now doesn't have delicate requirements on how it should be used so I'd expect it to just work when plugged into appropriate formulas and the usual ECDSA code.

For what it's worth, this was the case, at least for me (https://github.com/Yawning/secp256k1-voi).

Some notes:

With what is available from fiat right now, using projective coordinates with the complete formulas is likely optimal performance wise (as opposed to incomplete formulas with jacobian coordinates a la libsecp256k1). Note, that I personally am in the "take the performance hit, and use the complete formulas anyway, because it simplifies things" camp. However, adding a "Multiply-by-small-integer" special case would also accelerate the complete formulas.
Porting the Zig/C Bernstein-Yang inversion templates to Go, produced a working inversion routine that was 2-3x slower than a traditional addition-chain based one. I suspect this is another case of the go compiler not optimizing things all that well.

Things that would be cool to get from fiat for this curve:

Multiply by small int (for the field), where the "small" I care about is 21 (3*b, while users of Jacobian coordinates care about 2, 3, 4, 8).
On the scalar side, a formally verified split for GLV decomposition would be nice.
While addchain exists and produces an adqeuate high-assurance routine, a field-element sqrt operation also would be nice.

All in all, with the combination of fiat, the complete formulas for add/double, and addchain, the implementation basically wrote itself.

Answer 6 · 2023-06-09T10:21:56.000Z

Thank you for the report!

I have been explained the strategy behind constant-time select on top of Jacobian formulas and proving the functional correctness of that should be feasible with the tooling as it stands now.
Yeah, I believe the Bernstein-Yang inversion code would be sensitive to compiler details. I'm not sure there's much we can do about it, and I don't consider investigating it a priority right now.
I think mul-by-small is straightforward enough that I'll try to check it off when I look at this code again (which might be August).
I don't understand GLV decomposition well enough to say how hard it would be.
Thank you for the pointer to addchain, we may use it!

Answer 7 · 2023-06-09T12:10:14.000Z

Quick notes:

Inversion is too slow now because our faster jumpdivstep-based algorithm is not in the repository yet. One can get a sense of the performance speedup in Table 3 of https://eprint.iacr.org/2021/549.pdf (spoiler: more than 10x difference)
One can use addchain to compute an addition chain for square root too when p = 3 mod 4 for sparse p, since it consists of an exponentiation by (p+1)/4.
A starting point for GLV verification is https://pastel.archives-ouvertes.fr/tel-01563979/document

Answer 8 · 2023-06-09T12:23:04.000Z

Quick notes:

Inversion is too slow now because our faster jumpdivstep-based algorithm is not in the repository yet. One can get a sense of the performance speedup in Table 3 of https://eprint.iacr.org/2021/549.pdf (spoiler: more than 10x difference)

Ah ok. I'll be waiting for that then, thanks!

One can use addchain to compute an addition chain for square root too when p = 3 mod 4 for sparse p, since it consists of an exponentiation by (p+1)/4.

Yep. That's what I ended up doing (https://github.com/Yawning/secp256k1-voi/blob/moon/internal/field/field_sqrt.go).

I'm not sure how much more effort I'll put into optimizing my current library, since it's performance competitive with other implementations.