mit-plv/fiat-crypto

Unsaturated Solinas Heuristics infer too large a number of limbs for p521 x32

JasonGross opened this issue · 1 comments

./src/ExtractionOCaml/unsaturated_solinas 'p521' 32 '(auto)' '2^521 - 1' suggests 19 limbs, but ./src/ExtractionOCaml/unsaturated_solinas 'p521' 32 18 '2^521 - 1' succeeds. I thought the overflow check at

(** check if the suggested number of limbs will overflow
double-width registers when adding partial products after a
multiplication and then doing solinas reduction *)
Definition overflow_free : bool
:= let v := squaremod weight s c n loose_upperbounds in
forallb (fun k => Z.log2 k <? 2 * machine_wordsize) v.
Definition is_goldilocks : bool
:= match c with
| (w, v) :: _
=> (s =? 2^Z.log2 s)
&& (w =? 2^Z.log2 w)
&& (Z.log2 s =? 2 * Z.log2 w)
| nil => false
end.
End __.
#[global]
Hint Rewrite @length_distribute_balance @length_distribute_balance_step @length_balance @length_prime_upperbound_list @length_tight_upperbounds @length_loose_upperbounds @length_tight_bounds @length_loose_bounds : distr_length.
Inductive MaybeLimbCount := NumLimbs (n : nat) | Auto (idx : nat).
Section ___.
Context {tight_upperbound_fraction : tight_upperbound_fraction_opt}
(s : Z)
(c : list (Z * Z))
(machine_wordsize : Z).
(** given a parsed prime, pick out all plausible numbers of (unsaturated) limbs *)
(** an unsaturated implementation will necessarily need at least as many limbs
as a saturated one, so search starting there *)
Let num_bits_p := Z.log2_up s.
Let nlimbs_saturated := Z.to_nat (Qceiling (num_bits_p / machine_wordsize)).
Let min_limbs := nlimbs_saturated.
(* don't search past 2x as many limbs as saturated representation; that's just wasteful *)
Let result := filter (fun n => overflow_free n s c machine_wordsize) (seq min_limbs min_limbs).
Definition get_possible_limbs : list nat
:= result.
Definition get_num_limbs (n : MaybeLimbCount) : option nat
:= match n with
| NumLimbs n => Some n
| Auto idx => nth_error get_possible_limbs idx
end.
End ___.
was tight?

If I compute @get_num_limbs 1 (2^521) [(1,1)] 32 (Auto 0), the loose upperbound list for 18 is [1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 1610612736; 805306368] (hex: [0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x60000000, 0x30000000]), squaremod of this is [85604421717058387968; 83010348331692982272; 80416274946327576576; 77822201560962170880; 75228128175596765184; 72634054790231359488; 70039981404865953792; 67445908019500548096; 64851834634135142400; 62257761248769736704; 59663687863404331008; 57069614478038925312; 54475541092673519616; 51881467707308113920; 49287394321942708224; 46693320936577302528; 45396284243894599680; 44099247551211896832]. The Z.log2 of these are [66; 66; 66; 66; 66; 65; 65; 65; 65; 65; 65; 65; 65; 65; 65; 65; 65; 65]. This is off from 64 = 2*32, so I'm confused how 18 limbs manages to boundscheck...

cc @jadephilipoom @andres-erbsen do you know what's going on here?

Ah, nevermind, synthesizing with 18 limbs results in use of 128-bit integers