Non-uniform distribution of `randNumber` generation
Malien opened this issue · 2 comments
Is this a regression?
No
Description
Given the following snippet
import { randNumber } from "@ngneat/falso"
const freq = Array(10).fill(0)
for (const n of randNumber({ min: 0, max: 9, length: 10_000 }) {
freq[n]++
}
we can see that the distribution is almost uniform (sample output; run-to-run variance is negligible):
[563, 1041, 1123, 1062, 1069, 1171, 1171, 1113, 1109, 578]
First and last entries are always encountered half as much. No matter, if the number is an integer, a float (via setting fraction
parameter), and does not depend on the range size (min
and max
parameters), probabilities converge to
for a given amount of discrete values (S) with greater number of tests.
It is non-intuitive. This looks like a bug, in otherwise uniform-ish distribution.
Please provide a link to a minimal reproduction of the bug
No response
Please provide the exception or error you saw
No response
Please provide the environment you discovered this bug in
No response
Anything else?
Changing the algo will impact current "hard-seeded" outcomes, as such this might be a breaking change. I'm not saying that this is an important issue, just something I've noted. There are workarounds, such as using built-in (non-seedable) Math.random
, using another lib, or using randNumber
adding one more discrete value, and collapsing first and last outcomes into one.
Do you want to create a pull request?
No
@Malien telling from this line:
https://github.com/ngneat/falso/blob/main/packages/falso/src/lib/core/core.ts#L64C10-L64C66
This is caused by the algorithm itself.
If you run the same test but for the number coming from:
Number((Math.random() * (9 - 0) + 0).toFixed(0))
you get identical result. If we go further and use round
instead of toFixed
:
Number(Math.round((Math.random() * (3 - 1) + 1)))
We again get identical result.
So I think to solve that issue this particular line needs to be changed.
@Malien
Number(Math.floor((Math.random() * (10 - 0) + 0))
(floor and max + 1) produces proper uniform distribution.
This is compliant with MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random#getting_a_random_integer_between_two_values that tells us, that using Math.round()
won't give you uniform distribution but Math.floor
does.
This will fix generating for integer, however I'm not sure - yet - how to properly handle precision.