Reproducibility: software experiments with damn simple associativity and deep variability
Do you agree that `x+(y+z) == (x+y)+z`?
Well, let's see...
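As a quick illustration of why the answer is "not always", here is a minimal sketch with one hand-picked triplet. The values are our own choice for illustration (not taken from the implementations listed here), and the result assumes IEEE 754 double precision, which is what Python's `float` uses on common platforms.

```python
# Hand-picked triplet (illustrative; assumes IEEE 754 doubles).
x, y, z = 0.1, 0.2, 0.3

left = x + (y + z)   # 0.1 + 0.5
right = (x + y) + z  # 0.30000000000000004 + 0.3

print(left, right, left == right)
# prints: 0.6 0.6000000000000001 False
```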
Here are the current implementations:
- configurable Python implementation with `seed` and `number`
- configurable Java implementation with `basic` (float), `double` or `math`
- configurable C implementation with `custom` (optional) + `windows` or `linux` having an effect on random primitives. We compile with `gcc`, `i686-w64-mingw32-gcc` (when need be/possible), and `clang`
- configurable Rust implementation with compile-time options (associativity, multiplication inverse with and without Pi) and run-time options with optional error margin over equality
- LISP implementation
- configurable JavaScript implementation with `seed` number (and actually the surprising `global seed`) and `equality-check` (associativity, multiplication inverse with and without Pi)
- configurable Bash implementation with `equality-check` using `-e` (associativity, multiplication inverse with and without Pi)
- configurable Swift implementation with `seed`, `--number`, and `--equality-check`
- configurable OCaml implementation with `seed` (optional), `--number`, and `--equality-check`
- configurable Julia implementation with `seed` (optional), `--number`, `--equality-check`, and `stric-equality`
- configurable R implementation with `seed` (optional), `number`, and `eq_check`
- configurable Go implementation with `seed` (optional), `number`, and `equality-check`
- configurable Perl implementation with `seed` (optional), `number`, and `equality-check`
All implementations (except LISP, for now) support parameterization of the number of random generations. Executions are repeated 10 times by default, and the min, max, average, and standard deviation are reported.
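The experiment loop described above can be sketched as follows. This is a hedged illustration, not the repository's Python variant: the function names, the use of `random.Random`, the default values, and the choice of a single seed shared by all repetitions are assumptions made for the example.

```python
import random
import statistics


def associativity_holds(x, y, z):
    """Strict floating-point check of x+(y+z) == (x+y)+z."""
    return x + (y + z) == (x + y) + z


def proportion_ok(number, rng):
    """Proportion of `number` random triplets for which the equality holds."""
    ok = sum(
        associativity_holds(rng.random(), rng.random(), rng.random())
        for _ in range(number)
    )
    return ok / number


def repeated_experiment(number=1000, repetitions=10, seed=42):
    """Repeat the experiment and summarize the proportions (min, max, avg, std)."""
    rng = random.Random(seed)  # assumption: one seed drives all repetitions
    results = [proportion_ok(number, rng) for _ in range(repetitions)]
    return {
        "min": min(results),
        "max": max(results),
        "avg": statistics.mean(results),
        "std": statistics.stdev(results),
    }


if __name__ == "__main__":
    print(repeated_experiment())
```

Fixing the seed makes a run repeatable within one Python version; across languages and runtimes, different random primitives are one of the variation points explored here.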
To execute all variants and gather the results into a CSV: `export WINEDEBUG=-all; python eval.py > results.csv` (then do something with the data, like `rich results.csv`).
(note: `eval.sh` is deprecated and replaced by `eval.py`)
It's also possible to perform a kind of metamorphic testing across variants (see `multi_testing.py`).
By metamorphic testing, we mean here checking the following two (metamorphic) relations:
- (MR1) whenever a triplet `x, y, z` fails to satisfy the equality (e.g., associativity) for a given variant (e.g., Python), this triplet should also fail for another variant (e.g., JavaScript)
- (MR2) whenever a triplet `x, y, z` satisfies the equality (e.g., associativity) for a given variant (e.g., Python), this triplet should also succeed for another variant (e.g., JavaScript)
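The two relations can be sketched as a small checker. In the repository the variants are separate programs in different languages; here, as an assumption made purely for illustration, two Python functions stand in for them: one with strict equality and one with an error margin (via `math.isclose`, with an arbitrary `rel_tol`). All names below are hypothetical.

```python
import math


def strict_assoc(x, y, z):
    """Stand-in variant A (illustrative): strict equality."""
    return x + (y + z) == (x + y) + z


def tolerant_assoc(x, y, z):
    """Stand-in variant B (illustrative): equality up to a relative error margin."""
    return math.isclose(x + (y + z), (x + y) + z, rel_tol=1e-9)


def check_relations(triplets, variant_a, variant_b):
    """Return the triplets that violate MR1 and MR2 when going from A to B."""
    mr1_violations = []  # fails in A but succeeds in B
    mr2_violations = []  # succeeds in A but fails in B
    for t in triplets:
        a, b = variant_a(*t), variant_b(*t)
        if not a and b:
            mr1_violations.append(t)
        elif a and not b:
            mr2_violations.append(t)
    return mr1_violations, mr2_violations


triplets = [(0.1, 0.2, 0.3), (1.0, 2.0, 3.0)]
mr1, mr2 = check_relations(triplets, strict_assoc, tolerant_assoc)
print("MR1 violations:", mr1)
print("MR2 violations:", mr2)
```

With these two stand-ins, the first triplet fails under strict equality but passes with the error margin, so it is reported as an MR1 violation.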
At the moment, we have extended the Python and JavaScript variants so that both support `--check-case` (for verifying a triplet w.r.t. an equality relation) and `--failing-cases` (resp. `--success-cases`) for synthesizing a set of triplets that fail (resp. succeed) to respect the equality relation (associativity, multiplication inverse, multiplication inverse with Pi).
Hence, we can envision four scenarios:
- the failing cases as generated by Python are also failing in JavaScript
- the failing cases as generated by JavaScript are also failing in Python
- the success cases as generated by Python are also successful in JavaScript
- the success cases as generated by JavaScript are also successful in Python

(cases are triplets)
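The four scenarios amount to: for each ordered pair of variants, synthesize failing (resp. success) cases with the first variant and re-check them with the second. Below is a rough sketch under stated assumptions: it does not call the real `--failing-cases`, `--success-cases` or `--check-case` options, and the two "variants" are hypothetical in-process stand-ins (strict equality vs. equality with an error margin). The `max_draws` guard is our addition so that a variant that never fails does not loop forever.

```python
import itertools
import math
import random


def strict_check(x, y, z):
    return x + (y + z) == (x + y) + z


def tolerant_check(x, y, z):
    return math.isclose(x + (y + z), (x + y) + z, rel_tol=1e-9)


# Hypothetical stand-ins for two variants (e.g., two language implementations).
VARIANTS = {"strict": strict_check, "tolerant": tolerant_check}


def synthesize(check, expected, count, seed=0, max_draws=100_000):
    """Draw random triplets, keeping up to `count` whose outcome is `expected`."""
    rng = random.Random(seed)
    cases = []
    for _ in range(max_draws):
        if len(cases) == count:
            break
        t = (rng.random(), rng.random(), rng.random())
        if check(*t) == expected:
            cases.append(t)
    return cases


def scenarios(count=20):
    """Map (kind, source, target) to (number of cases, do they all agree?)."""
    report = {}
    for src, dst in itertools.permutations(VARIANTS, 2):
        for expected in (False, True):
            cases = synthesize(VARIANTS[src], expected, count)
            agree = all(VARIANTS[dst](*t) == expected for t in cases)
            kind = "success" if expected else "failing"
            report[(kind, src, dst)] = (len(cases), agree)
    return report


if __name__ == "__main__":
    for key, value in scenarios().items():
        print(key, value)
```

A scenario with zero synthesized cases agrees vacuously, so the number of cases is reported alongside the verdict.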
- https://lemire.me/blog/2019/03/12/multiplying-by-the-inverse-is-not-the-same-as-the-division/
- https://en.wikipedia.org/wiki/Linear_congruential_generator (pseudo-random generator)
- https://users.rust-lang.org/t/why-are-float-equality-comparsions-allowed/76603 (about float equality) and the Clippy lint https://rust-lang.github.io/rust-clippy/master/index.html#float_cmp
- https://gist.github.com/garandria/0e965d7a4efff89ed245d71f0c3785a3
- https://stackoverflow.com/questions/11006798/how-can-i-obtain-a-negative-random-integer-in-common-lisp
- https://simplecxx.github.io/2018/11/03/seed-mt19937.html (about random seed)
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87951 (about enum, switch, return types... clang vs g++)
- https://docs.julialang.org/en/v1/base/math/#Mathematical-Functions and the famous `≈` operator (`isapprox`)
- `cpan install Getopt::Long enum`
- https://esolangs.org/wiki/Random_Brainfuck
- https://esolangs.org/wiki/Brainfuck_algorithms#x_.3D_pseudo-random_number
- https://twitter.com/acherm/status/1634238174879703040
Is `x+(y+z) == (x+y)+z` true in Scratch?
Well, it depends on the upper bound used when randomly generating a value for `y` (and `x` and `z`), considering the example in `scratch/testassoc.sb3`.
You can import it in Scratch using https://scratch.mit.edu/projects/editor/. There are surprising results when varying the `y` upper bound:
- with value `100000000000000000000000000000000000`, `ncorrect = ~730`;
- with the value `1e53`, `ncorrect = 1000` (100%), perfect!;
- with (large) values in-between (play with the slider!), almost perfect (999 or 997 out of 1000) but not perfect...
- with specific value `1000000000000000000000000`, `ncorrect=1000` (out of 1000), so 100% (perfect).
note: for Scratch, it's hard to build a generator and systematize the exploration... At the moment, it's mostly for exploring what's going on and hopefully finding a comprehensive explanation.
To cross-compile for Windows from Linux with `i686-w64-mingw32-gcc`, specific packages are needed (e.g., on Fedora, `mingw64-gcc.x86_64`).
The combinatorial space is roughly as follows (in fact there are many more variation points and variants):
gcc -o testassoc-l testassoc.c
gcc -o testassoc-lc testassoc.c -DCUSTOM
i686-w64-mingw32-gcc -o testassoc-w testassoc.c -DWIN
i686-w64-mingw32-gcc -o testassoc-wc testassoc.c -DWIN -DCUSTOM
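The four commands above are the product of two variation points (target platform and `-DCUSTOM`). As a sketch of how such a build matrix could be enumerated, the snippet below regenerates them; the dictionaries and the `build_commands` helper are hypothetical names for illustration, not part of `eval.py`.

```python
import itertools

# Output suffix -> (compiler, platform flags), as in the commands above.
TARGETS = {
    "l": ("gcc", []),
    "w": ("i686-w64-mingw32-gcc", ["-DWIN"]),
}
# Output suffix -> flags enabling the custom variant.
CUSTOM = {"": [], "c": ["-DCUSTOM"]}


def build_commands(source="testassoc.c"):
    """Enumerate one compile command (as an argument list) per combination."""
    commands = []
    for (t_suffix, (compiler, t_flags)), (c_suffix, c_flags) in itertools.product(
        TARGETS.items(), CUSTOM.items()
    ):
        output = f"testassoc-{t_suffix}{c_suffix}"
        commands.append([compiler, "-o", output, source, *t_flags, *c_flags])
    return commands


for cmd in build_commands():
    print(" ".join(cmd))
```

Each argument list could be handed to `subprocess.run` to actually compile; adding a compiler such as `clang` or an optimization level would be one more dictionary in the product, which is how the number of variants grows quickly.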