malb/lattice-estimator

CheNgu12 enumeration fit data

Opened this issue · 11 comments

In the source for the CheNgu12 enumeration cost model it is mentioned that the formula for the exponent in the SVP cost was fitted to data from Table 4 of the BKZ 2.0 paper [CheNgu12]. However, inspection of [CheNgu12] shows that the numbers used for the fit are higher than those provided in the paper for linear pruning. It instead appears that the numbers used for the fit were taken from Table 5.2 (page 119) of Yuanmi Chen's thesis.

This may be a documentation bug. It is also interesting that the extreme-pruning numbers in the thesis are higher than the linear-pruning ones in [CheNgu12]; maybe a re-fit to those would be worthwhile?

For added context, I ran the enumeration fit code on the various data sets and obtained slightly different exponents:

sage: print("Yuanmi Chen 13 extreme") 
....: dim = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250] 
....: nodes = [39.0, 44.0, 49.0, 54.0, 60.0, 66.0, 72.0, 78.0, 84.0, 96.0, 99.0, 105.0, 111.0, 120.0, 127.0, 134.0]  # noqa 
....: times = [c + log(200,2).n() for c in nodes] 
....: T = list(zip(dim, nodes)) 
....: var("a,b,c,beta") 
....: f = a*beta*log(beta, 2.0) + b*beta + c 
....: f = f.function(beta) 
....: f.subs(find_fit(T, f, solution_dict=True)) 
....:  
....: print("CheNgu12 linear") 
....: dim = [100, 120, 140, 160, 180, 200, 220, 240, 280, 380] 
....: nodes = [33.6, 44.5, 56.1, 68.2, 80.7, 93.7, 107.0, 120.6, 148.8, 223.5] # noqa 
....: times = [c + log(200,2).n() for c in nodes] 
....: T = list(zip(dim, nodes)) 
....: var("a,b,c,beta") 
....: f = a*beta*log(beta, 2.0) + b*beta + c 
....: f = f.function(beta) 
....: f.subs(find_fit(T, f, solution_dict=True)) 
....:  
....: print("CheNgu12 extreme") 
....: dim = [100, 120, 140, 160, 180, 200, 220, 240, 280, 380] 
....: nodes = [9, 15, 21.7, 28.8, 36.4, 44.4, 52.8, 61.5, 79.8, 129.9] 
....: times = [c + log(200,2).n() for c in nodes] 
....: T = list(zip(dim, nodes)) 
....: var("a,b,c,beta") 
....: f = a*beta*log(beta, 2.0) + b*beta + c 
....: f = f.function(beta) 
....: f.subs(find_fit(T, f, solution_dict=True)) 

Output:

Yuanmi Chen 13 extreme
(a, b, c, beta)
beta |--> 0.270188776350190*beta*log(beta) - 1.0192050451318417*beta + 16.10253135200765
CheNgu12 linear
(a, b, c, beta)
beta |--> 0.181790170093418*beta*log(beta) - 0.4882442914735114*beta - 1.3146955691545792
CheNgu12 extreme
(a, b, c, beta)
beta |--> 0.182571394958456*beta*log(beta) - 0.7398311715129441*beta - 1.0833549677345786
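For reference, the same fit can be reproduced without Sage. The model is linear in (a, b, c), so ordinary least squares suffices; a minimal sketch with NumPy, fitting the Chen13 extreme-pruning data in natural-log form to match the output above:

```python
import numpy as np

# Chen13 extreme-pruning data (Table 5.2 of Chen's thesis, as above)
dim = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 190,
                200, 210, 220, 230, 240, 250], dtype=float)
nodes = np.array([39.0, 44.0, 49.0, 54.0, 60.0, 66.0, 72.0, 78.0,
                  84.0, 96.0, 99.0, 105.0, 111.0, 120.0, 127.0, 134.0])

# Model a*beta*ln(beta) + b*beta + c is linear in (a, b, c), so a
# least-squares solve gives the same answer Sage's find_fit converges to.
A = np.column_stack([dim * np.log(dim), dim, np.ones_like(dim)])
(a, b, c), *_ = np.linalg.lstsq(A, nodes, rcond=None)
print(a, b, c)  # roughly 0.2702, -1.0192, 16.10
```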
malb commented

Okay, looks like we should update to Chen 13 extreme, right?

Yep, unless we believe the CheNgu12 params are better? They predict faster enumeration, not sure if independent experimental data agrees with it.

malb commented

Chen13 seems to match what we got for FPLLL: ~= 1/2e*beta*log(beta) - beta + 16

> Chen13 seems to match what we got for FPLLL: ~= 1/2e*beta*log(beta) - beta + 16

But it's a super-exponential factor off, no? It's quite a big difference at, say, beta = 500.

sage: def chen13(beta):
....:     # From Fernando's numbers
....:     return 0.270188776350190*beta*log(beta,2.0) - 1.0192050451318417*beta + 16.10253135200765

sage: def fplll(beta):
....:     # These numbers come from [ABFKSW20]
....:     return 1/(2*e) * beta * log(beta, 2.0) - 0.995*beta + 16.25

sage: RR(fplll(500))
343.331....
sage: RR(chen13(500))
717.727....

Or am I missing something?

malb commented

log(beta) vs log(beta, 2) :)

Yep, annoying mathematicians using natural logs everywhere

sage: chen13 = lambda beta:  0.270188776350190*beta*log(beta) - 1.0192050451318417*beta + 16.10253135200765 
....: fplll = lambda beta: 1/(2*e) * beta * log(beta, 2.0) - 0.995*beta + 16.25 
....: RR(fplll(500)) 
....: RR(chen13(500))                                                                                                                                                                                                                                        
343.331928076297
346.058687590423
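The agreement can also be seen directly in the leading coefficients: converting the Chen13 natural-log coefficient to base 2 lands close to the 1/(2e) of the fplll fit. A quick check:

```python
import math

# Chen13 leading coefficient as fitted (natural-log form)
a_ln = 0.270188776350190

# a*beta*ln(beta) == (a*ln 2)*beta*log2(beta), so convert to the
# log2 form used in the fplll/[ABFKSW20] expression
a_log2 = a_ln * math.log(2)

print(a_log2)            # ~0.1873
print(1 / (2 * math.e))  # ~0.1839, the [ABFKSW20] leading coefficient
```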

Edit: to be thorough, I re-checked the curve fit while keeping the decimal digits from Chen's thesis (which are rounded away in the estimator's code); keeping them does not make a meaningful difference:

sage: chen13 = lambda beta:  0.270188776350190*beta*log(beta) - 1.0192050451318417*beta + 16.10253135200765  
....: chen13fp = lambda beta: 0.273273176432764*beta*log(beta) - 1.0400174907733366*beta + 16.915273240747165 
....: fplll = lambda beta: 1/(2*e) * beta * log(beta, 2.0) - 0.995*beta + 16.25  
....: RR(fplll(500))  
....: RR(chen13(500)) 
....: RR(chen13fp(500))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
343.331928076297
346.058687590423
346.049375524385

Annoyingly: should the class be renamed to Chen13? Or, for backwards compatibility, should we add a Chen13 class and keep CheNgu12 as an alias of it, to avoid breaking someone's code? I guess the lattice-estimator still has a moving API.
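For what it's worth, the backwards-compatible alias option could look something like this (hypothetical names and structure, not the estimator's actual class layout):

```python
import math
import warnings

class Chen13:
    """Hypothetical cost model refit to Table 5.2 of Chen's thesis."""

    def __call__(self, beta):
        # log2 cost of SVP enumeration in dimension beta (Chen13 extreme fit)
        return (0.270188776350190 * beta * math.log(beta)
                - 1.0192050451318417 * beta
                + 16.10253135200765)

class CheNgu12(Chen13):
    """Deprecated alias kept so existing user code does not break."""

    def __call__(self, beta):
        warnings.warn("CheNgu12 is deprecated; use Chen13", DeprecationWarning)
        return super().__call__(beta)
```

The subclass-plus-warning pattern lets old code keep running while nudging users toward the new name.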

malb commented

I vote for just renaming it, I don't want to guarantee the API (just yet)

Sounds fair; for enumeration, people should be using newer models anyway, I guess.