Theseus is an implementation of the SP 800-90B tests.
This project is named after Claude Shannon's mechanical maze-solving mouse .
For general SP 800-90B testing topics, please see Joshua E. Hill's SP 800-90B web page .
This was written with a fairly modern Linux system running on a somewhat modern Intel CPU (it uses BMI 2, SSE 4.2 and RdRand instructions). This is intended to be compiled using either a recent version of gcc (tested using gcc version 10.3.0, or clang (tested using clang version 12.0.1). This uses divsufsort, libbz2 and OpenMP, so these libraries (and their associated include files) must be installed and accessible to the compiler.
The folder structure is as follows:
docs/
: Contains documentation/examples for each function.
ex/
: Contains data files used in examples.
src/
: Contains the codebase.
The tools in this package operate on symbols of type statData_t
(uint8_t
by default) or on uint32_t
unless otherwise specified.
One can make all the binaries using:
Several Makefile
s are provided; these are useful in various contexts which are hopefully reasonably clear from the name.
Below is a summary of available Theseus functions. Detailed documentation for each function can be found in the docs/
folder and links are provided.
Executables that Implement Portions of SP 800-90B Testing
Function Name
Description
non-iid-main
An implementation of the non-IID SP 800-90B estimators.
restart-transpose
Calculate the transpose of the provided restart data matrix.
restart-sanity
Perform the restart test for the provided restart data.
permtests
Perform the permutation IID tests on the provided data.
chisquare
Perform the chi square IID tests on the supplied data.
lrs-test
Perform the LRS IID test on the supplied data.
Markov
Run some variant of the SP 800-90B 2016 Markov test.
H_submitter Production Utilities
Function Name
Description
ro-model
Produce a min entropy estimate using the selected stochastic model.
linear-interpolate
Takes a set of points (x_1
, y_1
), ... (x_n
, y_n
) and treats the points as a relation. This tool can be used to infer parameters from the statistical results.
Test Data Production Utilities
Function Name
Description
randomfile
Create a random data file for use in testing.
simulate-osc
Create a random data file for use in testing that simulates a ring oscillator.
Result Interpretation Utilities
Function Name
Description
percentile
Takes a set of human-readable doubles and provides the pth percentile. Percentile is estimated using the recommended NIST method Hyndman and Fan's R6 .
mean
Takes a set of human-readable doubles and provides the mean.
failrate
Takes a set of human-readable doubles and provides the proportion that are less than or equal to <p>
or greater than or equal to <q>
. This is useful to characterize false positive rates for statistical tests with inclusive failure bounds.
Data Conversion Utilities
Functions for word size and base input conversion:
Function Name
Description
u8-to-u32
Converts provided binary data from type uint8_t to type uint32_t.
u16-to-u32
Converts provided binary data from type uint16_t to type uint32_t.
u64-to-u32
Converts provided binary data from type uint64_t to type uint32_t.
dec-to-u32
Converts provided human-readable decimal values to binary data. (Note this is the opposite of u32-to-ascii.)
dec-to-u64
Converts provided human-readable decimal values to binary data. (Note this is the opposite of u64-to-ascii.)
hex-to-u32
Converts provided human-readable hexidecimal values to binary data.
Functions related to deltas, counters, endianness, and bit width:
Function Name
Description
u32-delta
Extracts deltas and then translates the result to positive values.
u64-jent-to-delta
Converts provided binary data from uint64_t deltas in jent format (JEnt version 3.0.1 and earlier) to uint64_t deltas in nanoseconds format. Also guesses native byte order and swaps if necessary. Jent format expects the upper 32 bits to contain seconds and the lower 32 bits to contain nanoseconds.
u32-counter-raw
Extracts deltas treated as 32-bit unsigned counters (they may roll over).
u64-counter-raw
Extracts deltas treated as 64-bit unsigned counters (they may roll over).
u32-counter-endian
Trys to guess counter endianness and translates to the local platform.
u64-counter-endian
Trys to guess counter endianness and translates to the local platform.
u64-change-endianness
Changes between big and little endian byte-ordering conventions.
u32-counter-bitwidth
Extracts deltas under the assumption that the data is an increasing counter of some inferred bitwidth.
u32-expand-bitwidth
Extracts inferred values under the assumption that the data is a truncation of some sampled value, whose bitwidth is inferred.
Functions for converting to type statData_t for statistical analysis:
Function Name
Description
u8-to-sd
Converts provided binary data from type uint8_t to type statData_t.
u16-to-sdbin
Converts provided binary data from type uint16_t to type statData_t by expanding packed bits.
u32-to-sd
Converts provided binary data from type uint32_t to type statData_t.
blocks-to-sdbin
Extracts bits from a blocksize-sized block one byte at a time, in the specified byte ordering.
Functions for output conversion (for histograms, human readability, categorization, etc.):
Function Name
Description
sd-to-hex
Converts provided binary data to human-readable hexidecimal values.
u32-to-ascii
Converts provided binary data to human-readable decimal values. (Note this is the opposite of dec-to-u32.)
u64-to-ascii
Converts provided binary data to human-readable decimal values. (Note this is the opposite of dec-to-u64.)
u32-to-categorical
Produces categorical summary of provided binary data.
Functions to bin and group data:
Function Name
Description
downsample
Groups data by index into modular classes mod <rate>
evenly into the <block size>
.
highbin
Attempts to bin input symbols into 2^<outputBits>
bins with equal numbers of adjacent samples.
u32-downsample
Groups data by index into modular classes mod <rate>
evenly into the <block size>
.
u32-manbin
Assign given binary data to one of the n bin numbers (0, ..., n-1).
Functions to select and extract data:
Function Name
Description
extractbits
Takes the given binary data and extracts bits with <bitmask>
.
selectbits
Identify the bit selections that are likely to contain the most entropy, up to <outputBits>
bits wide.
u32-bit-select
Selects and returns the value in the given bit position (0 is the LSB, 31 is the MSB).
u128-bit-select
Selects and returns the value in the given bit position (0 is the LSB, 127 is the MSB, little endian is assumed).
u32-selectdata
Attempt to keep the percentages noted in the provided binary data.
u32-selectrange
Extracts all values from the given binary data between a specified low
and high
(inclusive).
Functions to translate data:
Function Name
Description
translate-data
Perform an order-preserving map to arrange the input symbols to (0, ..., k-1).
u32-translate-data
Perform an order-preserving map to arrange the input symbols to (0, ..., k-1).
Functions to find and process fixed data:
Function Name
Description
bits-in-use
Determines the number of bits required to represent the given data after removing stuck and superfluous bits.
discard-fixed-bits
Takes provided binary data and returns it with fixed bits discarded. Non-fixed bits are moved to the LSB of the output.
u32-discard-fixed-bits
Takes provided binary data and returns it with fixed bits discarded. Non-fixed bits are moved to the LSB of the output.
u128-discard-fixed-bits
Takes provided binary data and returns it with fixed bits discarded. Non-fixed bits are moved to the LSB of the output.
Functions to merge, sort, and permute data:
Function Name
Description
double-merge
Merges two sorted lists of doubles into a single merged sorted list of doubles.
double-sort
Takes doubles from the file and sorts them.
u32-bit-permute
Permute bits within the given binary data as specified in <bit specification>
. Bit ordering is specified in the LSB 0 format (i.e., bit 0 is the LSB and bit 31 is the MSB).
Functions with mathematical operations on data:
Function Name
Description
double-minmaxdelta
Takes a set of human-readable doubles and provides the mean.
hweight
Calculates the Hamming weight of <bitmask>
. As an example, the bit string 11101000 has a Hamming weight of 4.
u16-mcv
Finds the most common value in the given binary data.
u32-anddata
Takes the given binary data and bitwise ANDs each symbol with <bitmask>
.
u32-gcd
Finds common divisors and removes these factors from the given binary data.
u32-mcv
Finds the most common value in the given binary data.
u32-regress-to-mean
Calculate the mean, force each k
-block to have this mean, and then round the resulting values.
u32-xor-diff
Produces the running XOR of adjacent values in provided binary data.
For more information on the estimation methods, see SP 800-90B .