The GRIM test

An implementation of the GRIM test in Python.

Introduction

This package is based on the GRIM (Granularity-Related Inconsistency of Means) test first highlighted by Heathers & Brown in their 2016 paper.

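The test uses a simple numerical property to check whether a reported mean of integer values could have been calculated correctly: the sum of n integers is itself an integer, so only certain means are possible for a given n.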

You don't need the original integer values. You just need the mean and the number (n) of items.
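The idea can be sketched in a few lines using only the standard library. This is a minimal illustration, not the package's own implementation: the function name `grim_consistent` and the way candidate sums are chosen are assumptions made for the example.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def grim_consistent(mean: str, n: int, rounding: str = ROUND_HALF_UP) -> bool:
    """Hypothetical helper: could `mean` be the rounded mean of n integers?"""
    target = Decimal(mean)
    # Smallest step at the reported precision, e.g. 0.01 for '11.09'.
    quantum = Decimal(1).scaleb(target.as_tuple().exponent)
    # The true integer sum must lie close to mean * n.
    approx_sum = int(target * n)
    # Rounding can move the mean by up to one quantum, so the sum can be
    # off by up to n * quantum; widen the search window accordingly.
    span = int(n * quantum) + 1
    for total in range(approx_sum - span, approx_sum + span + 1):
        candidate = (Decimal(total) / n).quantize(quantum, rounding=rounding)
        if candidate == target:
            return True
    return False


print(grim_consistent("11.09", 21))              # False
print(grim_consistent("11.09", 21, ROUND_DOWN))  # True
```

For a mean of 11.09 and n of 21, the candidate sums are 231, 232 and 233. None of them divides to 11.09 under ROUND_HALF_UP, but 233/21 truncates to 11.09 under ROUND_DOWN.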

What about rounding?

Often the mean you are testing has already been rounded. You can check whether the mean is consistent with a particular rounding type by passing that type as an argument.

This implementation supports all the rounding types found in the Python decimal module (at least between versions 3.8 and 3.11).

(They are: ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP, ROUND_UP, ROUND_05UP)

If no rounding type is included then the test assumes ROUND_HALF_UP.
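To see why the rounding type matters, here is a small sketch that rounds the same exact mean to two decimal places under each of the eight modes. The value 233/21 is chosen for illustration only, because 233 is the integer sum closest to 11.09 × 21. The code uses only the standard decimal module.

```python
import decimal
from decimal import Decimal

ROUNDING_MODES = [
    decimal.ROUND_CEILING, decimal.ROUND_DOWN, decimal.ROUND_FLOOR,
    decimal.ROUND_HALF_DOWN, decimal.ROUND_HALF_EVEN, decimal.ROUND_HALF_UP,
    decimal.ROUND_UP, decimal.ROUND_05UP,
]

# Exact mean of 21 integers that sum to 233 (11.095238...).
exact_mean = Decimal(233) / Decimal(21)

# Round the same value to 2 decimal places under every mode.
rounded = {
    mode: str(exact_mean.quantize(Decimal("0.01"), rounding=mode))
    for mode in ROUNDING_MODES
}

for mode, value in rounded.items():
    print(f"{mode:<16} {value}")
```

ROUND_DOWN, ROUND_FLOOR and ROUND_05UP give 11.09, while the other five modes give 11.10, so the same data can produce different reported means depending on how the rounding was done.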

These examples are available as a Google Colab Notebook

How do I install it?

On the command line:

pip install grim

In a Google Colab/IPython/Jupyter notebook:

!pip install grim

Example: Is this mean, n and rounding type consistent?

from grim import mean_tester
import decimal

# mean is 11.09 and n is 21
print(mean_tester.consistency_check('11.09', '21', decimal.ROUND_HALF_UP))

This will print False, because 11.09 cannot be the mean of 21 integers when ROUND_HALF_UP rounding is used.

Example: Is this mean & n consistent using any rounding type?

from grim import mean_tester
import decimal

# mean is 11.09 and n is 21
print(mean_tester.summary_consistency_check('11.09', '21'))

This will return:

{'ROUND_CEILING': False, 'ROUND_DOWN': True, 'ROUND_FLOOR': True, 'ROUND_HALF_DOWN': False, 'ROUND_HALF_EVEN': False, 'ROUND_HALF_UP': False, 'ROUND_UP': False, 'ROUND_05UP': True}

As you can see, a given mean and n might be consistent using one form of rounding but not others.

You can pass in the numbers as strings or Decimals. This avoids the floating point accuracy issues that are more likely to occur when using a 'float'.

How do I see logging about the possible matches the algorithm has considered?

Add an extra argument, log_status=True.

print(mean_tester.summary_consistency_check('11.09', '21', log_status=True))

The output would look like this:

Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.05, Upper match: 11.10, Match status: False, Rounding method: ROUND_CEILING
Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.04, Upper match: 11.09, Match status: True, Rounding method: ROUND_DOWN
Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.04, Upper match: 11.09, Match status: True, Rounding method: ROUND_FLOOR
Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.05, Upper match: 11.10, Match status: False, Rounding method: ROUND_HALF_DOWN
Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.05, Upper match: 11.10, Match status: False, Rounding method: ROUND_HALF_EVEN
Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.05, Upper match: 11.10, Match status: False, Rounding method: ROUND_HALF_UP
Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.05, Upper match: 11.10, Match status: False, Rounding method: ROUND_UP
Tue, 18 Apr 2023 18:02:00 +0000 : Target Mean: 11.09, Decimal places: 2, Lower match: 11.00, Middle match: 11.04, Upper match: 11.09, Match status: True, Rounding method: ROUND_05UP
{'ROUND_CEILING': False, 'ROUND_DOWN': True, 'ROUND_FLOOR': True, 'ROUND_HALF_DOWN': False, 'ROUND_HALF_EVEN': False, 'ROUND_HALF_UP': False, 'ROUND_UP': False, 'ROUND_05UP': True}

A warning about floating point numbers & computers:

Beware of creating Decimals from floating point numbers as these may have floating point inaccuracies.

e.g.:

import decimal

print(decimal.Decimal(1.1))
1.100000000000000088817841970012523233890533447265625

Notice how the inaccurate representation of 1.1 from the floating point number has been preserved in the Decimal. It's better to create a Decimal from a string, e.g.:

import decimal

print(decimal.Decimal('1.1'))
1.1

Many tools can be configured to read in text (that might be a number) as a string without parsing it. Some tools, such as Webdriver, only return strings, which is useful!
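As one example, Python's built-in csv module returns every field as a string, so reported means can go straight into a Decimal without ever becoming a float. The sketch below uses made-up, in-memory data, and the column names are assumptions for the example.

```python
import csv
import io
from decimal import Decimal

# Hypothetical data; in practice this would be a file opened with
# open(path, newline="").
raw = io.StringIO("study,mean,n\nA,11.09,21\nB,1.1,10\n")

rows = []
for row in csv.DictReader(raw):
    # Every field arrives as a str, so '11.09' never passes through float.
    rows.append((row["study"], Decimal(row["mean"]), row["n"]))

for study, mean, n in rows:
    print(study, mean, n)
```

The mean and n values collected this way are strings or Decimals, which are the input types described above.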

For more information on the origins of these issues in modern computer languages read this.

How can I find out more about the GRIM test?

James Heathers has published articles that explain how the technique works and how he used it to expose inconsistencies in scientific papers.

Citation file

There is a citation file included in the code repo.

phoughton/grim_test