xorF

xorF, short for xorshift coefficients (coFicients) aims to be a sort of total suit for xorshifts. It will provide tools capable of calculating coefficients for xorshifts of any size and period.
It is currently only working with 32, 64 and 128 bits. But it is stupidly fast, using both avx and concurent multithreading.
I plan to upgrade its capabilities to work with bigger sizes, where the speed will actually be necessary to obtain answer in any reasonnable amout of time.

Quick start

compile and run the project with make, by replacing #ofbits with either 32, 64 or 128 for, repectivily, 32, 64 or 128 bits:

> make N=#ofbits

Other compile targets for make are:

fast : default, compiled with -O2
plain : same as fast but without some optimisation
db : same as plain but with debug info

Which creates a program named xorF_#ofbits.exe in the bin/ folder

With N > 64, you can add SUB_MULT=1 to enable submatrix multiplication, which is faster on my machine.

there is also lut to create lut.exe that is used to create lut_#ofbits.data which contains binary look up tables for the right and left shift matrices

Math

What is the xorshift algorithm

This is only a very short explaination for more details see the paper from Marsaglia or the wikipedia page on the subject.
xorshift is a PRNG (Pseudo Random Number Generator) algorithm which generates integers with good randomness properties while being very efficient.
A typical implementation for xorshift is the following:

intN_t xorshift_N()
{
  static intN_t x = NON_ZERO_SEED_VALUE;
  y ^= y << a;
  y ^= y >> a;
  return (y ^= y << c);
}

where intN_t is an interger type of width N and a, b and c are the values computed by this program.
For such a PRNG to have its maximal period (2^N - 1), carefull choices of a, b and c must be made.

Matrix reprentation

One can represent this serie of xors, left and right shifts with matrices, as addition mod 2 is nothing other that xor.
Consequently, if one represents an N bit integer, y, with a N row tall column vector, Y, we can express the assignement y ^= y << a as Y = Y + Y L^a = Y(I + L^a), where + stands for addition mod 2 and L is the so-called left-shift matrix with a "diagonal" of zeros right below the main diagonal.
In a similar fasion, one can represent y ^= y >> b as Y = Y + R^b = Y (I + R^b).

The whole transformation can then be thought of as:

Y_(n+1) = Y_n (I + L^a)(I + R^b)(I + L^c) = Y_n T

This is not exactly the same for more than 64 bits but the spirit is similar, find matrices than represent easey (and fast) to compute operations on computer words.

actual computation

For the PRNG to have good randomness the matrix T must be non-singular. This can be easily check as for T to be non-singular, it must be null when exponentiated to P = 2^N - 1 or P/p_k, with the p_k's being the prime factors of P. The first preliminary test can be done really fast by computing (T^2)^n (by n fold repeated squaring) and checking if it is equal to T. The second batery of tests (only performed iff the first one is successful) can be done reasonably quickly by binary exponentiation.
Those tests are applied in a brute-force manner to all NxN matrices of the form described above, with 0 <= a, b, c <= N. Every triplet that satify both tests are recorded and are good coefficients for an implementation of the xorshift algorithm.