zeh/prando

It is easy to bring the generator into a bad state by providing a bad seed


When using Prando in tests as a source of reproducible randomness, it is tempting to initialize it with some simple seeds such as '', '42' or '43'.

I'm noticing a number of problems when doing so.

First, when the seed is an empty string, Prando seems to get stuck and keeps returning the same number (0.5000000001164153) on every call.

Repro:

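import Prando from 'prando';

// Fails with an empty-string seed: both calls to next() return the same value.
it('not eq', () => {
  const p = new Prando('');
  expect(p.next()).not.toEqual(p.next());
});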

Second, with two different seed strings such as '42' and '43', one would expect the two generated sequences to be completely different (avalanche effect). However, they look quite similar (at least in the first number produced).

I understand that Prando is not designed to provide cryptographic-grade randomness, but I expected it to provide numbers that are at least "random enough" for basic statistics.

I guess what I expected could be achieved by calling some cryptographic hash function on the seed before passing it into Prando, but I'm curious why Prando does not already do this internally.
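For illustration, here is a minimal sketch of pre-hashing a string seed on the caller's side. It swaps in an FNV-1a style 32-bit hash (non-cryptographic, applied to UTF-16 code units) instead of a cryptographic one, purely to keep the example dependency-free. `hashSeed` is a hypothetical helper and not part of Prando.

```typescript
// Hypothetical helper, not part of Prando: FNV-1a style 32-bit hash
// over the UTF-16 code units of the seed string.
function hashSeed(seed: string): number {
  let h = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < seed.length; i++) {
    h ^= seed.charCodeAt(i);      // mix in the next code unit
    h = Math.imul(h, 0x01000193); // multiply by the FNV prime, truncated to 32 bits
  }
  return h >>> 0; // reinterpret as an unsigned 32-bit integer
}

// The empty string maps to the offset basis, which is non-zero,
// and '42' and '43' map to different numeric seeds.
const emptySeed = hashSeed('');
const seedA = hashSeed('42');
const seedB = hashSeed('43');
```

The resulting number could then be passed to the constructor in place of the raw string, for example `new Prando(hashSeed('42'))`.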

zeh commented

Hey @burdakovd,

(Sorry, for some reason I missed this issue report before, and only saw it now)

Thanks for your report.

  1. You're right, empty strings are producing an invalid internal seed (0). The same was true of a passed seed of value 0; they cannot be shifted so the generator gets stuck. I've fixed this issue. Seed generation is a bit safer now, recovering from cases like that (simply setting the seed to 1).
  2. I actually agree with you on the string seed issue. Strings with a very close alphanumeric similarity like "41" and "42" would produce a close hash, because it was just a char code count. I didn't want to employ any big third-party hashing algorithm because I wanted to make the library small, but it's clear the current solution was not good enough in real world applications. I've added a change that now passes the string-based hash through the internal xorshifter function, therefore making the string hash result more random-looking. This is a breaking change since the same string seeds now produce different internal seeds.
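To illustrate both points, here is a rough sketch of the idea. It assumes a textbook 32-bit xorshift step and a naive char-code sum as the string hash; Prando's actual internals may differ, and every name below (`xorshift32`, `sanitizeSeed`, `naiveStringHash`, `mixedStringHash`) is hypothetical.

```typescript
// Illustrative only; not Prando's actual code. All names are hypothetical.

// One step of a textbook 32-bit xorshift (shift triple 13, 17, 5).
function xorshift32(x: number): number {
  x ^= x << 13;
  x ^= x >>> 17;
  x ^= x << 5;
  return x >>> 0; // keep the result as an unsigned 32-bit integer
}

// Zero is a fixed point: shifting and XORing zero only ever yields zero,
// so a generator seeded with 0 never leaves that state.
const stuck = xorshift32(0);

// Recover from an invalid seed by falling back to 1.
function sanitizeSeed(seed: number): number {
  const s = seed >>> 0;
  return s === 0 ? 1 : s;
}

// Naive string hash (assumed here to be a plain sum of char codes):
// similar strings such as "41" and "42" land on adjacent values.
function naiveStringHash(s: string): number {
  let sum = 0;
  for (let i = 0; i < s.length; i++) {
    sum += s.charCodeAt(i);
  }
  return sum;
}

// Passing the naive hash through the xorshift step spreads adjacent
// values apart; an empty string still ends up with a usable seed of 1.
function mixedStringHash(s: string): number {
  return sanitizeSeed(xorshift32(naiveStringHash(s)));
}
```

With this sketch, `naiveStringHash("41")` and `naiveStringHash("42")` differ by exactly 1, while their `mixedStringHash` values are far apart.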

Both changes are now part of the 4.0.0 release, a major bump due to the breaking changes.

Thanks again!