Result differences between nist sts-2.1.2

Question

terrancewong opened this issue 6 years ago · 3 comments

Testing the official reference data.pi: sts 3.2.5 and 3.2.6 give the same results, but they do not match NIST sts 2.1.2.

Platform

% gcc --version
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
. . .
% uname -a
Linux <hostname> 4.15.0-38-generic #41-Ubuntu SMP Wed Oct 10 10:59:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Results

test	2.1.2	3.2.5
BlockFrequency	0.380615	0.584952
LongestRun	0.024390	0.027295
Universal	0.669012	0.687852

This is confusing us.

Command line and user input for 2.1.2

% cat test.pi.txt
0
data/data.pi
1
0
1
0
% ./assess 1000000 < test.pi.txt

Command line for 3.2.x

% ./sts -i 1 -w ../326pi -F a -S 1000000 -s ../sts-2.1.2/data/data.pi

I can see that the difference in the "Block Frequency test" is due to a different block length and number of substrings:

< 			Block Frequency test
---
> 			BLOCK FREQUENCY TEST
3,6c3
< 		(a) Chi^2           = 58.010254
< 		(b) # of substrings = 61
< 		(c) block length    = 16384
< 		(d) bits discarded  = 576
---
> 		COMPUTATIONAL INFORMATION:
8c5,10
< SUCCESS		p_value = 0.584952
---
> 		(a) Chi^2           = 7849.375000
> 		(b) # of substrings = 7812
> 		(c) block length    = 128
> 		(d) Note: 64 bits were discarded.
> 		---------------------------------------------
> SUCCESS		p_value = 0.380615
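
The substring and discarded-bit counts in the two outputs above follow from integer division of the bit count by the block length. A minimal Python sketch of that arithmetic, assuming nothing about how either sts version implements it (the name `block_partition` is made up for illustration):

```python
def block_partition(n_bits, block_len):
    """Split n_bits into whole blocks of block_len bits.

    Returns (number of whole blocks, leftover bits that are discarded).
    """
    return divmod(n_bits, block_len)


# sts 3.2.x run from this report: 1000000 bits, block length 16384
print(block_partition(1000000, 16384))  # (61, 576)

# sts 2.1.2 run from this report: 1000000 bits, block length 128
print(block_partition(1000000, 128))    # (7812, 64)
```

Both results match the "# of substrings" and "bits discarded" lines in the diff, so the two versions differ in the block length they use for this test, not in how they partition the data.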

So let's talk about the other two (LongestRun and Universal).

Answer 1 · 2018-11-08T18:28:13.000Z

Hello terrancewong,

First, thank you for your comments and report. We do appreciate you taking the time to comment.

At first glance we see that only 1 block of data is being evaluated:

... -i 1 ...

Our data.pi file has 1125470 ASCII characters in it, and yet you have a bitcount of 100000.
So you are processing only 1 bitcount block. To process most of the data set, try:

... -i 8 -S 112544 ...

We recommend that people use -v 3 to see what is happening.

BTW, with -v 3 you will see that -S 100000 is a small bitcount and is forcing several tests to be disabled. Normally we recommend a larger bit count (such as 1000000), but sometimes you don't have the data. So given your small data set, 100000 will have to do.

With -v you will see that only 1 iteration was performed. The -v 3 output shows:

Start of iterate phase
Completed iteration 1 of 1 at 2018-11-08 10:05:17
End of iterate phase

You cannot produce anything close to reasonable stats with just one data block.

A bit better might be:

sts -v 3 -i 8 -I 1 -w /var/tmp/326pi -F a -S 112544 -s .../data.pi

You have only 8 blocks to process, and your p-value set for block frequency is tiny:

0.382544
0.242065
0.087767
0.280007
0.788745
0.728973
0.497652
0.647068

One can hardly use a distribution of only 8 p-values (because you have only 8 bitcount blocks of data) spread over 10 bins for a uniformity test. By default, sts bins p-values into 10 ranges: [0.0,0.1), [0.1,0.2), ... [0.9,1.0].
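
To make the point concrete, here is a rough Python sketch that drops the 8 p-values listed above into 10 equal-width bins. It only illustrates the binning described here and is not taken from the sts source; `bin_p_values` is a hypothetical helper name.

```python
def bin_p_values(p_values, bins=10):
    """Count p-values per bin: [0.0,0.1), [0.1,0.2), ... [0.9,1.0]."""
    counts = [0] * bins
    for p in p_values:
        # p == 1.0 goes into the last bin, which is closed on the right
        idx = min(int(p * bins), bins - 1)
        counts[idx] += 1
    return counts


# The 8 Block Frequency p-values from the -i 8 run above
p_values = [0.382544, 0.242065, 0.087767, 0.280007,
            0.788745, 0.728973, 0.497652, 0.647068]

counts = bin_p_values(p_values)
print(counts)              # [1, 0, 2, 1, 1, 0, 1, 2, 0, 0]
print(len(p_values) / 10)  # 0.8 expected p-values per bin
```

With an expected count of 0.8 per bin and four bins empty, a chi-square style uniformity check has very little to work with; a common rule of thumb asks for an expected count of about 5 or more per bin.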

At least results.txt shows:

The "Block Frequency" test passed both the analyses.

And BlockFrequency/stats.txt shows:

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 6.374756
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.382544

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 7.946289
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.242065

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 11.019775
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.087767

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 7.464600
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.280007

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 3.158203
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.788745

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 3.612305
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.728973

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 5.367188
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.497652

                    Block Frequency test
            ---------------------------------------------
            (a) Chi^2           = 4.218994
            (b) # of substrings = 6
            (c) block length    = 16384
            (d) bits discarded  = 14240
            ---------------------------------------------

SUCCESS p_value = 0.647068
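
As a cross-check on the numbers above, the p-values can be reproduced from the printed Chi^2 and "# of substrings" values, assuming the SP 800-22 block frequency formula p = igamc(N/2, Chi^2/2). The sketch below uses only the Python standard library; `chi2_upper_tail` is a hypothetical helper, not an sts function, and it builds the regularized upper incomplete gamma function Q(a, x) from the recurrence Q(a+1, x) = Q(a, x) + x^a e^(-x) / Gamma(a+1).

```python
import math


def chi2_upper_tail(chi2, dof):
    """Return Q(dof/2, chi2/2), i.e. igamc(dof/2, chi2/2), for integer dof >= 1."""
    if chi2 <= 0.0:
        return 1.0
    x = chi2 / 2.0
    if dof % 2 == 0:
        a = 1.0
        q = math.exp(-x)              # Q(1, x)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(x))   # Q(1/2, x)
    target = dof / 2.0
    while a < target:
        # Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1),
        # evaluated in log space to avoid overflow for large dof
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


# (Chi^2, # of substrings, p_value reported in this thread)
cases = [
    (6.374756, 6, 0.382544),       # first block of the -i 8 run
    (58.010254, 61, 0.584952),     # sts 3.2.x, block length 16384
    (7849.375000, 7812, 0.380615), # sts 2.1.2, block length 128
]
for chi2, n_blocks, reported in cases:
    print(f"N={n_blocks}: computed {chi2_upper_tail(chi2, n_blocks):.6f}, reported {reported}")
```

If the computed and reported values agree, both versions are applying the same formula and the Block Frequency difference comes from the block length alone.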

Answer 2 · 2018-11-09T03:05:06.000Z

Hi lcn2,

I really appreciate the reply.

Sorry, I did not mention that I was verifying the "empirical" numbers in SP 800-22 (rev1a) Appendix B.

We'll also try to look deeper into the -v details and some small differences in parameters.

Answer 3 · 2018-11-13T00:46:46.000Z

Hello terrancewong,

We ran sts v3.2.6 on a larger set of Pi. See the google drive directory:

https://drive.google.com/open?id=1pPtwskyB5qFEAUzGp5-jzb8lpgdjzFfw

The file, pi_1b-binary.u8, contains 4 billion (4e9) raw binary bits of Pi after the decimal point:

https://drive.google.com/open?id=1htFeIC9gx9Zmn7OWku8ABv7QRCDxWilS

The run script (sts.run.sh) drives sts to process 3999268864 bits of data (499908608 bytes):

https://drive.google.com/open?id=1MApuPW824WD34o6lBNbY4ifDEkK5kkHJ

It produces results (result.txt) showing that, as expected, Pi is random to at least an alpha level of 0.01:

https://drive.google.com/open?id=1C-3lrJjtoMn7a3J3DKYVZVspHWgfq0uo

See the README.txt file in that directory for details:

https://drive.google.com/open?id=1hFDGDOiaQZX6KqypsNDxZMbXOCfOKBHz

Based on this extensive test, we believe that sts v3.2.6 is analyzing Pi correctly.

We hope this helps.