arcetri/sts

Result differences from NIST sts-2.1.2

terrancewong opened this issue · 3 comments

Testing the official reference data.pi: 3.2.5 and 3.2.6 give the same results, but they do not match NIST sts 2.1.2.

Platform

% gcc --version
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
. . .
 % uname -a
Linux <hostname> 4.15.0-38-generic #41-Ubuntu SMP Wed Oct 10 10:59:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Results

Test            2.1.2     3.2.5
BlockFrequency  0.380615  0.584952
LongestRun      0.024390  0.027295
Universal       0.669012  0.687852

This is confusing us.

Command line and user input for 2.1.2

% cat test.pi.txt
0
data/data.pi
1
0
1
0
% ./assess 1000000 < test.pi.txt 

Command line for 3.2.x

% ./sts -i 1 -w ../326pi -F a -S 1000000 -s ../sts-2.1.2/data/data.pi

I can see that the difference in the "Block Frequency test" is due to the different block length and number of substrings:

< 			Block Frequency test
---
> 			BLOCK FREQUENCY TEST
3,6c3
< 		(a) Chi^2           = 58.010254
< 		(b) # of substrings = 61
< 		(c) block length    = 16384
< 		(d) bits discarded  = 576
---
> 		COMPUTATIONAL INFORMATION:
8c5,10
< SUCCESS		p_value = 0.584952
---
> 		(a) Chi^2           = 7849.375000
> 		(b) # of substrings = 7812
> 		(c) block length    = 128
> 		(d) Note: 64 bits were discarded.
> 		---------------------------------------------
> SUCCESS		p_value = 0.380615
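The substring counts and discarded bits in the diff above follow from integer division of the bit count by the block length. A minimal sketch of that arithmetic, assuming a bit count of 1000000 as given on both command lines (the helper name `block_partition` is hypothetical and is not part of either sts version):

```python
def block_partition(n_bits, block_len):
    """Split n_bits into whole blocks of block_len bits.

    Returns (number of whole blocks, leftover bits that are discarded).
    Hypothetical helper for illustration; not taken from the sts source.
    """
    n_blocks = n_bits // block_len
    discarded = n_bits - n_blocks * block_len
    return n_blocks, discarded


# Block lengths as reported in the two outputs being compared.
for label, block_len in (("sts 2.1.2", 128), ("sts 3.2.x", 16384)):
    n_blocks, discarded = block_partition(1_000_000, block_len)
    print(f"{label}: block length={block_len} "
          f"substrings={n_blocks} discarded={discarded}")
```

With so different a number of substrings, the two Chi^2 values and p-values are not expected to agree, even on identical input bits.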

So let's talk about the other 2.

lcn2 commented

Hello terrancewong,

First, thank you for your comments and report. We do appreciate you taking the time to comment.

At first glance we see that only 1 block of data is being evaluated:

... -i 1 ...

Our data.pi file has 1125470 ASCII characters in it, and yet you have a bitcount of 100000.
So you are processing only 1 bitcount block. To process most of the data set, try:

... -i 8 -S 112544 ...

We recommend that people use -v 3 to see what is happening.

BTW with -v 3 you will see that -S 100000 is a small bitcount and is forcing several tests to be disabled. Normally we recommend a larger bit count (such as 1000000), but sometimes you don't have the data. So given your small data set, 100000 has to be.

With -v 3 you will see that only 1 iteration was performed. The -v 3 output shows:

Start of iterate phase
Completed iteration 1 of 1 at 2018-11-08 10:05:17
End of iterate phase

You cannot produce anything close to reasonable stats with just one data block.

A bit better might be:

sts -v 3 -i 8 -I 1 -w /var/tmp/326pi -F a -S 112544 -s .../data.pi

You have only 8 blocks to process, and your set of p-values for block frequency is tiny:

0.382544
0.242065
0.087767
0.280007
0.788745
0.728973
0.497652
0.647068

One can hardly consider a distribution of only 8 p-values (because you have only 8 bitcount blocks of data) spread over 10 bins (by default, sts bins p-values into 10 ranges: [0.0,0.1), [0.1,0.2), ... [0.9,1.0]) to be a meaningful uniformity test.
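To make the point concrete, here is a rough sketch that bins the 8 p-values listed above into 10 equal ranges and computes the usual chi-square statistic against a uniform expectation. The function names are hypothetical and this is not the sts implementation; it only illustrates that the expected count per bin is below 1.

```python
# The 8 block frequency p-values listed above.
p_values = [0.382544, 0.242065, 0.087767, 0.280007,
            0.788745, 0.728973, 0.497652, 0.647068]


def bin_counts(values, n_bins=10):
    """Count p-values per bin; the last bin is closed, so 1.0 lands in it."""
    counts = [0] * n_bins
    for p in values:
        index = min(int(p * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


def uniformity_chi2(counts):
    """Chi-square statistic against an equal expected count in every bin."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


counts = bin_counts(p_values)
print("counts per bin:", counts)
print("expected per bin:", len(p_values) / len(counts))
print("chi^2:", uniformity_chi2(counts))
```

With an expected count of 0.8 per bin, several bins are necessarily empty, so the statistic says little about whether the p-values are uniformly distributed.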

At least results.txt shows:

  • The "Block Frequency" test passed both the analyses.

And BlockFrequency/stats.txt shows:

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 6.374756
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.382544

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 7.946289
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.242065

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 11.019775
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.087767

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 7.464600
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.280007

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 3.158203
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.788745

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 3.612305
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.728973

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 5.367188
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.497652

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 4.218994
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.647068
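The p-values above can be cross-checked from the printed Chi^2 values. SP 800-22 defines the block frequency p-value as igamc(N/2, Chi^2/2), where N is the number of substrings. The sketch below assumes N is even, so that igamc has a finite closed form; the function names are hypothetical and this is not the sts code.

```python
import math


def igamc_integer_a(a, x):
    """Upper regularized incomplete gamma Q(a, x) for a positive integer a.

    Uses the closed form exp(-x) * sum_{k=0}^{a-1} x**k / k!.
    """
    term = 1.0
    total = 1.0
    for k in range(1, a):
        term *= x / k
        total += term
    return math.exp(-x) * total


def block_frequency_p_value(chi2, n_blocks):
    """p-value = igamc(N/2, chi2/2); this sketch handles even N only."""
    if n_blocks % 2 != 0:
        raise ValueError("this sketch only handles an even number of blocks")
    return igamc_integer_a(n_blocks // 2, chi2 / 2.0)


# (Chi^2, reported p_value) pairs copied from the stats.txt output above.
reported = [(6.374756, 0.382544), (11.019775, 0.087767), (3.158203, 0.788745)]
for chi2, p_reported in reported:
    p = block_frequency_p_value(chi2, n_blocks=6)
    print(f"chi2={chi2:.6f} computed={p:.6f} reported={p_reported:.6f}")
```

The closed form does not apply to the single-block run with 61 substrings, since N/2 is then not an integer.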

terrancewong commented

Hi lcn2,

I really appreciate the reply.

Sorry, I did not mention that I was verifying the "empirical" numbers in SP 800-22 Rev. 1a, Appendix B.

We'll also try to look deeper into the -v details and the small differences in parameters.

lcn2 commented

Hello terrancewong,

We ran sts v3.2.6 on a larger set of Pi. See the google drive directory:

https://drive.google.com/open?id=1pPtwskyB5qFEAUzGp5-jzb8lpgdjzFfw

The file, pi_1b-binary.u8, contains 4 billion (4e9) raw binary bits of Pi after the decimal point:

https://drive.google.com/open?id=1htFeIC9gx9Zmn7OWku8ABv7QRCDxWilS

The run script (sts.run.sh) drives sts to process 3999268864 bits of data (499908608 bytes):

https://drive.google.com/open?id=1MApuPW824WD34o6lBNbY4ifDEkK5kkHJ

The run produces results (result.txt) that show that, as expected, Pi is random to
at least an alpha confidence level of 0.01:

https://drive.google.com/open?id=1C-3lrJjtoMn7a3J3DKYVZVspHWgfq0uo

See the README.txt file in that directory for details:

https://drive.google.com/open?id=1hFDGDOiaQZX6KqypsNDxZMbXOCfOKBHz

Based on this extensive test, we believe that sts v3.2.6 is analyzing Pi correctly.

We hope this helps.