entropyxyz/crypto-primes

Implement some performance optimizations for the Lucas test

Opened this issue · 2 comments

There are a few tricks that may speed up the Lucas test:

  1. Crandall & Pomerance1, after Algorithm 3.6.7 (Lucas chain), provide some ways to reduce the general (P, Q) Lucas sequence to (P', 1) which may eliminate some multiplications.
  2. Baillie et al2, in Eqs 13-18, give an alternative way to calculate the Lucas sequence to what we use, which generates U_k in addition to V_k, which means we don't need to calculate U_k via modular inverse. It does require more multiplications though. Need to test if it leads to the performance improvement.

Footnotes

  1. R. Crandall, C. Pomerance, "Prime numbers: a computational perspective",
    2nd ed., Springer (2005) (ISBN: 0-387-25282-7, 978-0387-25282-7)

  2. R. Baillie, A. Fiori, S. S. Wagstaff,
    "Strengthening the Baillie-PSW primality test",
    Math. Comp. 90 1931-1955 (2021),
    DOI: 10.1090/mcom/3616

The second item does not seem to be particularly useful; the single modular inverse we perform to check for U = 0 is fast enough, and dividing by 2 in Montgomery space (which is required to employ the Baillie et all approach) is quite inconvenient.

The first item does not seem to be useful for most of the checks we perform, since it seems that we can only switch back from (P', 1) sequence to (P, Q) for the 2d-th element (where n + 1 = d * 2^s), while we need it for the d-th element.

That said, it would work for the Lucas-V check, but in that case we will have to have a separate lucas_test() function specifically for Lucas-V, while now it's just a few lines in the main lucas_test(). If Lucas-V ever becomes the preferred check in presets, it will be worth doing.