tom-lord/regexp-examples

Randomness regression?

Closed this issue · 2 comments

cabo commented

The CDDL tool uses regexp-examples as part of its instance-example generation.

A typical regex might turn up in a CDDL expression such as:

nai = tstr .regexp "\\w+@\\w+(\\.\\w+)+"

A while ago, the CDDL tool generated

"N1@CH57HF.4Znqe0.dYJRN.igjf"

from that. Now I get instances such as

"3l9FP@dHYj37.bcdac.a.a.a.a"
"1caU3X@zJ1.aeeeba"
"UZY@AZU.beecea.a.a"
"jDQFpQ@6iW.ccbbcc.a"
"kdP7X@9jPaW.ccdbc.a"
"Q@4.aabeb.a.a.a.a"
"Hc@8Ts2t.aaccc"
"wK9S@5yMKl.ccaeea.a"
"U1W4WN@wDzHUH.beedca.a.a.a.a.a"
"0Fd8Mn@3FuVVy.adbeda.a.a"

Obviously, this is less satisfying as a set of examples.

Any reason why the entropy vanishes at the end of the RE?

(The same is true with

nai = tstr .regexp "[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+"

except that in this case the upper case wins:

"p1zG@na.CDABD"
"f9OHe@3kTD4.CCEEDC.A.A"
"hOeIxN@v5h.DAAAB.A.A.A"
"t7q@oEdreG.CCBAC.A"
"to6@HYu.ABADB.A.A.A.A.A"
"B1Qv@ujnEqZ.EBBEA.A.A.A"
"Uf@l4kv.CEEBBC.A.A.A"
"js@6P.BDDE.A.A.A.A"
"r5Ot1K@9c.ADCECB"
"t@GA4.BBCDDB.A.A"

)
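To put a number on the effect, here is a rough sketch that only inspects the ten samples quoted for the first regexp; it does not call the gem at all. The helper names `dotted_labels` and `distinct_chars` are made up for this illustration.

```ruby
# Samples copied verbatim from the first list in this report.
SAMPLES = %w[
  3l9FP@dHYj37.bcdac.a.a.a.a
  1caU3X@zJ1.aeeeba
  UZY@AZU.beecea.a.a
  jDQFpQ@6iW.ccbbcc.a
  kdP7X@9jPaW.ccdbc.a
  Q@4.aabeb.a.a.a.a
  Hc@8Ts2t.aaccc
  wK9S@5yMKl.ccaeea.a
  U1W4WN@wDzHUH.beedca.a.a.a.a.a
  0Fd8Mn@3FuVVy.adbeda.a.a
].freeze

# Labels produced by the trailing (\.\w+)+ group, i.e. everything
# after the first dot that follows the "@".
def dotted_labels(sample)
  sample.split("@").last.split(".").drop(1)
end

# Sorted string of the distinct characters used across all given strings.
def distinct_chars(strings)
  strings.join.chars.uniq.sort.join
end

local_parts  = SAMPLES.map { |s| s.split("@").first }
first_labels = SAMPLES.map { |s| dotted_labels(s).first }
later_labels = SAMPLES.flat_map { |s| dotted_labels(s).drop(1) }

puts "local parts:  #{distinct_chars(local_parts).size} distinct characters"
puts "first label:  #{distinct_chars(first_labels)}"  # abcde
puts "later labels: #{later_labels.uniq.inspect}"     # ["a"]
```

So the first repetition of the group draws only from "a" to "e", and every further repetition is the single character "a", while the part before the "@" still looks well mixed.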

Interesting, thanks for the report... I'll take a look into this ASAP. At a guess, I suspect it's got something to do with the max_results feature that was added in v1.2.0 of the gem.

Writing tests to prevent such a regression is tricky (I fixed a more subtle issue back in v1.1.3), but I'll have a think about how to improve the suite.
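One possible shape for such a test, purely as a sketch and not the suite's actual approach: draw many examples and require a minimum number of distinct characters in the final label. `distinct_tail_chars` and `fake_example` are hypothetical names; `fake_example` is a stand-in generator so the snippet runs without the gem, and in a real spec the block would presumably call the gem's `random_example` on the regexp instead.

```ruby
# Hypothetical helper: collect the last dot-separated label of each
# generated string and count the distinct characters seen across them.
def distinct_tail_chars(sample_count: 200)
  tails = Array.new(sample_count) { yield.split(".").last }
  tails.join.chars.uniq.size
end

# The 63 characters Ruby's \w matches in ASCII.
WORD_CHARS = [*("a".."z"), *("A".."Z"), *("0".."9"), "_"].freeze

# Stand-in for a healthy generator (assumption: uniform over \w).
def fake_example(rng)
  "user@host." + Array.new(5) { WORD_CHARS.sample(random: rng) }.join
end

rng = Random.new(42) # fixed seed keeps the check reproducible

healthy    = distinct_tail_chars { fake_example(rng) }
degenerate = distinct_tail_chars { "user@host.bcdac.a" }

puts "healthy generator:    #{healthy} distinct tail characters"
puts "degenerate generator: #{degenerate} distinct tail characters"
```

A threshold well below the alphabet size (say 20 of the 63 `\w` characters) would keep the check from flaking while still catching output like the ".a.a.a" tails reported here.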

Sorry it took me absolutely ages to get round to fixing this... The project's not dead; I'm just a busy guy 😅

Gem version v1.4.3 is now released with the fix; thanks again for the report 😄