This is a Clojure(Script) implementation of the xoroshiro128+ PRNG described at http://xoroshiro.di.unimi.it/
The algorithm has been shown to be fast and produce superior statistical results to many PRNGs shipped with languages, including Java. The statistical results have been verified in both PractRand and TestU01 by the authors.
xoroshiro128+ is designed to be the successor to xorshift128+, currently used in the JavaScript engines of Chrome, Firefox and Safari.
Both xorshift128+ and xoroshiro128+ have a period of 2128 but xoroshiro128+ is benchmarked by the authors as 20% faster and with 20% fewer failures in BigCrush than its predecessor.
Everything is in xoroshiro128.core
.
(require '[xoroshiro128.core])
The simplest usage is to simply call rand
to generate a random long.
This uses an atom to store the state of the prng, which is updated to the next value in sequence on each call to rand
.
rand
is automatically seeded by java.util.Random
or Math.random
, passed through splitmix64, and so should Just Work.
This atom can be manually reset/seeded (e.g. during testing) by calling seed-rand!
and passing it a long. Internally, as per the suggestion in the reference implementation, this long will be passed through a splitmix64 generator to generate the two 64 bit seeds needed for the xoroshiro128+ algorithm. seed-rand!
also accepts two arguments, which should both be longs to pass directly to xoroshiro128+ (no splitmix64 involved).
As of version 1.1.1
, calls to rand
in parallel will use compare-and-set!
to ensure that each value is returned only once across all threads. No guarantees about the ordering of rand
are made for parallel execution.
For example, using pmap
and map
to produce two sequences of random numbers with rand
, starting with the same seed, would:
- create the same set of longs twice
- not order the sequences in the same way,
pmap
will have incorrect ordering - not duplicate any values in
pmap
that aren't duplicated inmap
See xoroshiro128.test.state
for examples.
Both xoroshiro128+ and splitmix64 are implemented as immutable types with a shared protocol. The IPRNG
protocol exposes next
and value
to generate the next item in the sequence or the long representing the current item, respectively.
This protocol allows us to construct a lazy sequence from a seed quite easily with iterate
and map
.
(require '[xoroshiro128.core :as x])
(def seed 12345)
(def my-rand-seq (map x/value (iterate x/next (x/xoroshiro128+ seed))))
(take 5 my-rand-seq)
; (5983371452340661002 -139170902943549277 2600980751997770790 3131701164191746090 -3375623441569470803)
Additionally, we can inspect any item in the sequence to extract the seed, allowing us to resume the sequence from that point later.
(require '[xoroshiro128.core :as x])
(def seed 9876)
(def my-rand-item (-> (x/xoroshiro128+ seed) x/next x/next x/next))
; Peek ahead at my-rand-item + 1
(x/value (x/next my-rand-item))
; 3486300715335445982
; Extract the seed from my-rand-item for later.
(def new-seed (x/seed my-rand-item));
; [-5785456751514194665 7961309068892779353]
; Create a new item from our extracted seed.
(def new-rand-item (apply x/xoroshiro128+ new-seed))
; If we seeded correctly, new-rand-item + 1 should be the same as my-rand-item + 1.
(x/value (x/next new-rand-item))
; 3486300715335445982
Seeds for the xoroshiro128+ algorithm must be 128 bit (as the name implies).
As clojure on the JVM supports 64 bit integers (longs) but not 128 bit integers, the seed is represented internally as two longs in a vector.
The situation is slightly more complex in clojurescript as JavaScript does not provide native support for 64 bit integers, only 64 bit floats. The google.math.Long
class from Google Closure is used to provide 64 bit integer support, and then 128 bit seed support using two longs as per the clojure implementation.
As mentioned above, if only a single long is available to seed the PRNG, the splitmix algorithm can be used to extrapolate further longs to use as a seed. For convenience long->seed128
is provided convert a 64 bit seed into a 128 bit seed. This function is used internally when only a single long is provided.
The authors of Xoroshiro128+ recommend against seeding the PRNG with the output of a similar PRNG.
"research has shown that initialization must be performed with a generator radically different in nature from the one initialized to avoid correlation on similar seeds."
They also mention that similar PRNGs may be used internally by Math.random
in JavaScript, depending on the browser brand and version (this could change at any time):
Albeit superseded by xoroshiro128+, xorshift128+ is presently used in the JavaScript engines of Chrome, Firefox, Safari, Microsoft Edge.
For this reason I recommend against providing the full 128 bits directly from the native PRNG in a web browser - pass 64 bits (1 number from Math.random
) through long->seed128
instead (it uses splitmix64 internally to fix this problem).
It is worth noting that converting a 64 bit seed to a 128 bit seed using long->seed128
is a deterministic process, i.e. any given long always provides the same seed. This means the pool of available seeds is 64 bits, and is not magically increased to 128 bits. Whether this matters or not is entirely contextual, but providing 128 bit seeds will drastically increase the size of the pool of available pseudo random sequences to draw upon.
As UUIDs represent 128 bit integers (hexadecimal), they are also supported for convenience as seed values both via. the xoroshiro128+
function and uuid->seed128
. Note that UUID generation functions leave a few bits of the UUID static to indicate the UUID version and variant. This means the seed pool for a given UUID generation function is both orders of magnitude larger than a single 64 bit seed, and smaller than two independant 64 bit seeds.
The current seed value can be extracted as a 128 bit seed vector from both Xoroshiro128+
and Splitmix64
type data with the seed
function.
The Xoroshiro128+ algorithm supports a jump function to easily create new non-overlapping sequences from any starting point.
Calling the jump
function is equivalent to calling next
264 times.
As the total period of Xoroshiro128+ is 2128 it is possible to call jump 264 times before needing to reseed a totally new sequence.
If you are using simple calls to rand
there is a jump-rand!
function provided to jump the state of rand
.
(require '[xoroshiro128.core :as x])
(def seed 55555)
(def my-rand (x/xoroshiro128+ seed))
; jumped-rand is equivalent to 2^64 calls to (x/next my-rand)
(def jumped-rand (x/jump my-rand))
The outputs of Splitmix64 next value, Xoroshiro128+ next value, and the Xoroshiro128+ jump function have all been verified against samples from the reference C implementation.
Dozens of values were generated from https://ideone.com/PuauK5 and fed directly into the expected outputs for the test suite.
It is a common requirement to convert our random 64 bit integers into a random decimal number between 0 and 1. For example, this is a convenient way to insulate ourselves from goog.math.Long
in cljs as the output of such a conversion will be a native JS float.
Unfortunately the naive approach of simply dividing a generated long by the the maximum long produces a biased, non-uniform distribution.
The author of Xoroshiro128+ and Splitmix64 suggests two options for correctly converting from generated 64 bit integers to a 64 bit float between [0, 1).
Their primary recommendation is available for both clj and cljs as xoroshiro128.core/long->unit-float
.
I recommend using this function if you don't have any specific reason to prefer the alternative.
The alternative recommendation is available for clj only as xoroshiro128.core/long->unit-float:alt
.
Referencing the alternative algorithm, the authors state:
"This technique is extremely fast, but you will be generating half the values you could actually generate."
As far as I'm aware, neither JavaScript nor Google Closure provide a native/fast way to perform the operations required by the alterative algorithm. Even if this can be implemented somehow in JS, I'm dubious whether any speed advantage over the standard algorithm would survive the implementation details (e.g. juggling strings internally). Without this speed advantage the alternative algorithm is clearly inferior, so in my opinion the implementation effort is simply not justified for JavaScript.
All benchmarking code is available under xoroshiro128.test.performance
.
I did some basic benchmarking on my laptop using criterium and found ~66% speed improvement using xoroshiro128+ vs. the default Java PRNG.
As always with benchmarking, YMMV.
I compared (.nextLong (java.util.Random.))
against (xoroshiro128.core/rand)
.
Results from java.util.Random
:
Evaluation count : 831204360 in 60 samples of 13853406 calls.
Execution time mean : 70.719127 ns
Execution time std-deviation : 0.615334 ns
Execution time lower quantile : 69.942200 ns ( 2.5%)
Execution time upper quantile : 72.194872 ns (97.5%)
Overhead used : 1.675261 ns
Found 2 outliers in 60 samples (3.3333 %)
low-severe 2 (3.3333 %)
Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Results from xoroshiro128.core/rand
:
Evaluation count : 2301527460 in 60 samples of 38358791 calls.
Execution time mean : 24.654548 ns
Execution time std-deviation : 0.398656 ns
Execution time lower quantile : 24.329306 ns ( 2.5%)
Execution time upper quantile : 25.960526 ns (97.5%)
Overhead used : 1.687464 ns
Found 5 outliers in 60 samples (8.3333 %)
low-severe 2 (3.3333 %)
low-mild 3 (5.0000 %)
Variance from outliers : 2.0241 % Variance is slightly inflated by outliers
Results from (comp xoroshiro128.xoroshiro128/long->unit-float xoroshiro128.state/rand)
Evaluation count : 1925694840 in 60 samples of 32094914 calls.
Execution time mean : 29.564911 ns
Execution time std-deviation : 0.185632 ns
Execution time lower quantile : 29.263758 ns ( 2.5%)
Execution time upper quantile : 29.912975 ns (97.5%)
Overhead used : 1.684063 ns
Found 1 outliers in 60 samples (1.6667 %)
low-severe 1 (1.6667 %)
Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Given these results I think it's safe to recommend xoroshiro128+ as a mostly "drop in" replacement for (.nextLong (java.util.Random.))
.
Unsurprisingly the performance of goog.math.Long
is significantly worse across the board than native JavaScript floats.
I'm not aware of any benchmarking tool for CLJS that is as sophisticated as criterium, so I simply ran each function 1 million times and recorded the timestamps with performance.now
.
Results:
LOG: '"benchmarking xoroshiro128.state/rand"'
LOG: '198.56'
LOG: '"benchmarking xoroshiro128.state/rand as xoroshiro128.xoroshiro128/long->unit-float"'
LOG: '3715.5350000000003'
LOG: '"benchmarking xoroshiro128.state/native-rand"'
LOG: '9810.575'
LOG: '"benchmarking Math.random"'
LOG: '20.904999999998836'
These numbers seem to wobble by around +/- 20% on subsequent runs.
We can see that xoroshiro128+ is ~10x slower than Math.random
but it's a bit "apples vs. oranges" as Math.random
produces floats between [0, 1) and xoroshiro128+ produces goog.math.Long
objects across the full 64 bit integer range.
We see ~200ns per call (0.29s for 1 000 000 calls) in CLJS vs. ~25ns per call in CLJ. This puts the JVM at around 10x faster than JS for this particular use-case. This is also another "apples vs. oranges" comparison as the JVM and JS runtime environments are obviously very different.
To get "apples to apples" timings within JS (and to seed rand
) I created a "native random long" function that works exactly like the internals of random-uuid
, but stops halfway to return a single goog.math.Long
instead of a full UUID:
; lifted from https://cljs.github.io/api/cljs.core/random-uuid
(let [hex #(.toString (rand-int 16) 16)]
(goog.math.Long.fromString
(apply str (take 16 (repeatedly hex)))
16))
This approach is ~50x slower than xoroshiro128+ and ~470x slower than Math.random
(due to the string manipulation, I assume).
We can also compare the speed of float generation by passing the output of rand
calls to long->unit-float
against Math.random
. In this case, xoroshiro128+ generates floats ~180 times slower than Math.random
calls.
Overall the use-cases for xoroshiro128+ in CLJS are not as clear cut as CLJ due to lack of native long support.
I recommend xoroshiro128+ when:
- Seeding the PRNG is important
- Working with the full range of 64 bit integers (
goog.math.Long
) is acceptible or even important - Wanting to work against the
xoroshiro128.prng/IPRNG
protocol - Compatibility with another system implementing xoroshiro128+ is important
I recommend Math.random
when working with an unseeded PRNG with an undefined algorithm outputting only floats on [0, 1) is acceptible.
CLJS benchmarks were conducted on Chrome with the :advanced
compiler optimization flag as this should best represent usage in production deployments. Interestingly, advanced compilation made Math.random
calls about 6x slower, and goog.math.Long
based logic ~30-60% faster.
Fair warning that changing the browser and CLJS compilation optimisations level drastically changes the absolute and relative benchmark timings - in some cases by 100% or more.
xoroshiro128+ and the family of related generators are not cryptographically secure or intended for use in cryptography.
These generators are designed to produce a seedable, statistically uniform distribution at high speeds, with a reasonable period.
The xoroshiro128+ algorithm reference implementation in C was developed by David Blackman and Sebastiano Vigna in 2016 under a Creative Commons public domain license https://creativecommons.org/publicdomain/zero/1.0/.
This clojure implementation is copyright © 2016 David Meister.
Distributed under the Eclipse Public License version 1.0.
Thanks to mscharley for helping to verify the outputs of the test suite.