Reproducibility: software experiments with damn simple associativity and deep variability

Reproducibility, associativity, and deep variability

Do you agree that x+(y+z) == (x+y)+z? Well, let's see...
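As a quick illustration of the question, here is a minimal Python sketch. The function names `is_associative` and `proportion_associative` are made up for this example and are not taken from the repository's scripts.

```python
import random


def is_associative(x: float, y: float, z: float) -> bool:
    """Return True when both groupings of the sum give the same double."""
    return x + (y + z) == (x + y) + z


def proportion_associative(n: int, seed: int) -> float:
    """Share of n random triplets in [0, 1) for which the equality holds."""
    rng = random.Random(seed)
    ok = sum(
        is_associative(rng.random(), rng.random(), rng.random())
        for _ in range(n)
    )
    return ok / n


if __name__ == "__main__":
    # The classic counterexample: 0.6 on one side, 0.6000000000000001 on the other.
    print(is_associative(0.1, 0.2, 0.3))  # False
    print(proportion_associative(1000, seed=42))
```

The exact proportion depends on the seed, the number of triplets, and how the random values are drawn, which is precisely what the variants below let you vary.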

Here are the current implementations:

  • configurable Python implementation with seed and number
  • configurable Java implementation with basic (float), double or math
  • configurable C implementation with custom (optional) + Windows or Linux, which affects the random primitives. We compile with gcc, i686-w64-mingw32-gcc (when needed and possible), and clang
  • configurable Rust implementation with compile-time options (associativity, multiplication inverse with and without Pi) and run-time options with optional error margin over equality
  • LISP implementation
  • configurable JavaScript implementation with seed number (and actually the surprising global seed) and equality-check (associativity, multiplication inverse with and without Pi)
  • configurable Bash implementation with equality-check using -e (associativity, multiplication inverse with and without Pi)
  • configurable Swift implementation with seed, --number, and --equality-check
  • configurable OCaml implementation with seed (optional), --number, and --equality-check
  • configurable Julia implementation with seed (optional), --number, --equality-check, and strict-equality
  • configurable R implementation with seed (optional), number, and eq_check
  • configurable Go implementation with seed (optional), number, and equality-check
  • configurable Perl implementation with seed (optional), number, and equality-check
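Several variants above mention three equality checks. The sketch below shows one plausible reading of them in Python. The exact formulas are an assumption (in particular the way z and Pi enter the "multiplication inverse" checks), and the names `RELATIONS` and `check` are hypothetical; the actual variants may define the checks differently.

```python
import math

# Assumed formulas: each takes a triplet and returns True when the equality holds.
# Inputs are expected to be non-zero so that the divisions are defined.


def associativity(x: float, y: float, z: float) -> bool:
    return x + (y + z) == (x + y) + z


def mult_inverse(x: float, y: float, z: float) -> bool:
    # Multiplying numerator and denominator by z should not change the ratio.
    return (x * z) / (y * z) == x / y


def mult_inverse_pi(x: float, y: float, z: float) -> bool:
    # Same idea, with an extra factor Pi on both sides of the fraction.
    return (x * z * math.pi) / (y * z * math.pi) == x / y


RELATIONS = {
    "associativity": associativity,
    "mult_inverse": mult_inverse,
    "mult_inverse_pi": mult_inverse_pi,
}


def check(name: str, x: float, y: float, z: float) -> bool:
    """Evaluate the named equality check on one triplet."""
    return RELATIONS[name](x, y, z)


if __name__ == "__main__":
    for name in RELATIONS:
        print(name, check(name, 0.1, 0.2, 0.3))
```

Keeping the checks in a dictionary makes it straightforward to select one from a command-line option, in the spirit of the equality-check options listed above.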

All implementations (except LISP so far) support parameterizing the number of random generations. Executions are repeated 10 times by default (min, max, average, and standard deviation are reported).
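The repetition scheme can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the sample standard deviation is an assumption (the evaluation script may use another estimator), and the proportion of triplets satisfying associativity stands in for whatever each variant reports.

```python
import random
import statistics


def proportion_associative(n: int, rng: random.Random) -> float:
    """Share of n random triplets for which x+(y+z) == (x+y)+z."""
    ok = 0
    for _ in range(n):
        x, y, z = rng.random(), rng.random(), rng.random()
        ok += x + (y + z) == (x + y) + z
    return ok / n


def repeat(n: int = 1000, repetitions: int = 10, seed=None) -> dict:
    """Run the experiment several times and summarize the proportions."""
    rng = random.Random(seed)
    props = [proportion_associative(n, rng) for _ in range(repetitions)]
    return {
        "min": min(props),
        "max": max(props),
        "average": statistics.mean(props),
        "std": statistics.stdev(props),  # sample standard deviation (assumption)
    }


if __name__ == "__main__":
    print(repeat(n=1000, repetitions=10, seed=42))
```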

To execute all variants and gather the results into a CSV:

export WINEDEBUG=-all; python eval.py > results.csv

You can then do something with the data, for example rich results.csv. Note: eval.sh is deprecated and replaced by eval.py.

(Meta|Multi)morphic testing

It's also possible to perform a kind of metamorphic testing across variants (see multi_testing.py). By metamorphic testing, we mean here checking the following two (metamorphic) relations:

  • (MR1) whenever a triplet x, y, z violates the equality (e.g., associativity) for a given variant (e.g., Python), this triplet should also violate it for another variant (e.g., JavaScript)
  • (MR2) whenever a triplet x, y, z satisfies the equality (e.g., associativity) for a given variant (e.g., Python), this triplet should also satisfy it for another variant (e.g., JavaScript)

At the moment, we have extended the Python and JavaScript variants so that both support --check-case (for verifying a triplet w.r.t. an equality relation) and --failing-cases (resp. --success-cases) for synthesizing a set of triplets that fail (resp. succeed) to respect the equality relation (associativity, multiplication inverse, multiplication inverse with Pi). Hence, we can envision four scenarios:

  • the failing cases as generated by Python are also failing in JavaScript
  • the failing cases as generated by JavaScript are also failing in Python
  • the success cases as generated by Python are also successes in JavaScript
  • the success cases as generated by JavaScript are also successes in Python (cases are triplets)
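The four scenarios can be sketched with two stand-in variants written in Python. This does not reproduce multi_testing.py, which works with the actual Python and JavaScript programs. Here `assoc_double` and `assoc_single` (an emulated single-precision check) are hypothetical substitutes for two variants, and `synthesize` and `mr_violations` are illustrative names.

```python
import random
import struct


def to_f32(v: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def assoc_double(x: float, y: float, z: float) -> bool:
    """Stand-in variant A: associativity check in double precision."""
    return x + (y + z) == (x + y) + z


def assoc_single(x: float, y: float, z: float) -> bool:
    """Stand-in variant B: same check with emulated single precision."""
    x, y, z = to_f32(x), to_f32(y), to_f32(z)
    return to_f32(x + to_f32(y + z)) == to_f32(to_f32(x + y) + z)


def synthesize(check, n: int, seed: int, want: bool) -> list:
    """Collect n random triplets whose outcome under `check` equals `want`."""
    rng = random.Random(seed)
    cases = []
    while len(cases) < n:
        t = (rng.random(), rng.random(), rng.random())
        if check(*t) == want:
            cases.append(t)
    return cases


def mr_violations(source, target, n: int = 100, seed: int = 0, want: bool = False) -> list:
    """Triplets with outcome `want` under `source` but not under `target`.

    want=False corresponds to MR1 (failing cases), want=True to MR2 (success cases).
    """
    return [t for t in synthesize(source, n, seed, want) if target(*t) != want]


if __name__ == "__main__":
    pairs = [("A->B", assoc_double, assoc_single), ("B->A", assoc_single, assoc_double)]
    for label, src, tgt in pairs:
        for want in (False, True):
            bad = mr_violations(src, tgt, n=100, seed=0, want=want)
            print(label, "MR2" if want else "MR1", "violations:", len(bad))
```

An empty list means the relation held for every synthesized triplet; any returned triplet is a concrete case where the two variants disagree.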

Resources

General

https://lemire.me/blog/2019/03/12/multiplying-by-the-inverse-is-not-the-same-as-the-division/

https://en.wikipedia.org/wiki/Linear_congruential_generator (pseudo-random generator)

Rust

https://users.rust-lang.org/t/why-are-float-equality-comparsions-allowed/76603 (about float equality comparisons)

https://rust-lang.github.io/rust-clippy/master/index.html#float_cmp (Clippy lint float_cmp)

LISP

https://gist.github.com/garandria/0e965d7a4efff89ed245d71f0c3785a3

https://stackoverflow.com/questions/11006798/how-can-i-obtain-a-negative-random-integer-in-common-lisp

C++

https://simplecxx.github.io/2018/11/03/seed-mt19937.html (about random seed)

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87951 (about enum, switch, return types... clang vs g++)

Julia

https://docs.julialang.org/en/v1/base/math/#Mathematical-Functions, including the well-known isapprox function (the ≈ operator)
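For readers who do not use Julia, a rough Python analogue of comparing with a tolerance instead of strict equality is math.isclose. The sketch below is only an illustration: the function names are hypothetical, and the default tolerances of math.isclose and isapprox differ.

```python
import math


def assoc_strict(x: float, y: float, z: float) -> bool:
    """Strict equality between the two groupings."""
    return x + (y + z) == (x + y) + z


def assoc_approx(x: float, y: float, z: float, rel_tol: float = 1e-9) -> bool:
    """Equality up to a relative error margin."""
    return math.isclose(x + (y + z), (x + y) + z, rel_tol=rel_tol)


if __name__ == "__main__":
    print(assoc_strict(0.1, 0.2, 0.3))  # False
    print(assoc_approx(0.1, 0.2, 0.3))  # True
```

With an error margin, most triplets that fail the strict check pass, so the choice between strict and approximate equality is itself a variation point.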

Perl

cpan install Getopt::Long enum

Brainfuck

https://esolangs.org/wiki/Random_Brainfuck

https://esolangs.org/wiki/Brainfuck_algorithms#x_.3D_pseudo-random_number

https://twitter.com/acherm/status/1634238174879703040

Scratch

Is x+(y+z) == (x+y)+z true in Scratch? Well, it depends on the upper bound used when randomly generating a value for y (and x and z), at least in the example in scratch/testassoc.sb3. You can import it into Scratch using https://scratch.mit.edu/projects/editor/. The results are surprising when varying the upper bound of y:

  • with value 100000000000000000000000000000000000, ncorrect = ~730;
  • with value 1e53, ncorrect = 1000 (100%), perfect;
  • with (large) values in between (play with the slider!), ncorrect is almost perfect (999 or 997 out of 1000) but not perfect;
  • with the specific value 1000000000000000000000000, ncorrect = 1000 (out of 1000), so 100% (perfect).

Note: for Scratch, it's hard to build a generator and systematize the exploration. At the moment, the project is mostly for exploring what's going on and, hopefully, finding a comprehensive explanation.
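Outside Scratch, the same kind of exploration can be scripted. The Python sketch below counts how many of 1000 random triplets satisfy associativity for a given upper bound. It is not a model of Scratch: the function name `ncorrect`, the use of random.uniform, and applying the same bound to x, y, and z are all assumptions, so the counts need not match those observed in Scratch.

```python
import random


def ncorrect(upper: float, trials: int = 1000, seed=None) -> int:
    """Count triplets drawn from [0, upper] for which x+(y+z) == (x+y)+z."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        x = rng.uniform(0.0, upper)
        y = rng.uniform(0.0, upper)
        z = rng.uniform(0.0, upper)
        if x + (y + z) == (x + y) + z:
            count += 1
    return count


if __name__ == "__main__":
    # Sweep a few upper bounds; fix the seed to make runs repeatable.
    for upper in (1e3, 1e24, 1e35, 1e53):
        print(upper, ncorrect(upper, trials=1000, seed=0))
```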

C

To cross-compile for Windows from Linux with i686-w64-mingw32-gcc, specific packages are needed (e.g., mingw64-gcc.x86_64 on Fedora). The combinations are roughly as follows (in fact, there are many more variation points and variants):

gcc -o testassoc-l testassoc.c
gcc -o testassoc-lc testassoc.c -DCUSTOM
i686-w64-mingw32-gcc -o testassoc-w testassoc.c -DWIN
i686-w64-mingw32-gcc -o testassoc-wc testassoc.c -DWIN -DCUSTOM
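The four commands above are the product of two variation points (target platform and the optional CUSTOM flag). A small Python sketch can enumerate them; the dictionaries and the function name `build_commands` are hypothetical and only mirror the commands listed here, without running any compiler.

```python
from itertools import product

# Variation point 1: target platform -> (compiler, extra flags)
PLATFORMS = {
    "l": ("gcc", []),
    "w": ("i686-w64-mingw32-gcc", ["-DWIN"]),
}

# Variation point 2: optional custom mode -> extra flags
CUSTOM = {
    "": [],
    "c": ["-DCUSTOM"],
}


def build_commands(source: str = "testassoc.c") -> list:
    """Return one compiler command (as an argument list) per combination."""
    commands = []
    for (ptag, (compiler, pflags)), (ctag, cflags) in product(
        PLATFORMS.items(), CUSTOM.items()
    ):
        output = f"testassoc-{ptag}{ctag}"
        commands.append([compiler, "-o", output, source, *pflags, *cflags])
    return commands


if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
```

Adding another variation point (for example clang, or an optimization level) then amounts to adding one more dictionary to the product.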