mbedtls_mpi_mod_int() produces incorrect results

Summary

mbedtls_mpi_mod_int produces the wrong result for certain inputs (initially found on 32-bit x86)

System information

Mbed TLS version (number or commit id): 49e9fbd
Operating system and version: Linux x64
Configuration (if not default, please attach mbedtls_config.h):
Compiler and options (if you used a pre-built binary, please indicate how you obtained it): gcc and clang with -m32
Additional environment information:

Expected behavior

Print 3644370

Actual behavior

Print 650989178

Steps to reproduce

export CFLAGS="-m32"
git clone --depth 1 -b development https://github.com/ARMmbed/mbedtls.git
cd mbedtls/
mkdir build/
cd build/
cmake .. -DENABLE_PROGRAMS=0 -DENABLE_TESTING=0
make -j$(nproc)

Then compile and run:

#include <mbedtls/bignum.h>

#define CF_CHECK_EQ(expr, res) if ( (expr) != (res) ) { goto end; }

int main(void)
{
    mbedtls_mpi a;
    mbedtls_mpi_uint ret;
    mbedtls_mpi_init(&a);
    CF_CHECK_EQ(mbedtls_mpi_read_string(&a, 10, "230772460340063000000100500000300000010"), 0);
    CF_CHECK_EQ(mbedtls_mpi_mod_int(&ret, &a, 1205652040), 0);
    printf("%u\n", ret);
end:
    mbedtls_mpi_free(&a);
    return 0;
}

Additional information

I'm interested in this issue and have some findings here. Please correct me if I'm wrong. The problem is the algorithm for general case:

mbedtls/library/bignum.c

Lines 1476 to 1490 in faefe62

    
               /* 
        
                * general case 
        
                */ 
        
               for( i = A->n, y = 0; i > 0; i-- ) 
        
               { 
        
                   x  = A->p[i - 1]; 
        
                   y  = ( y << biH ) | ( x >> biH ); 
        
                   z  = y / b; 
        
                   y -= z * b; 
        
                   x <<= biH; 
        
                   y  = ( y << biH ) | ( x >> biH ); 
        
                   z  = y / b; 
        
                   y -= z * b; 
        
               }

From my limited math knowledge, the algorithm is correct but we might have overflow when doing y << biH. Because we couldn't ensure that y < (1 << biH), when b >= (1 << biH). y is only guaranteed to be less than b.
So the possible solution would be, if it is acceptable, only do calculation when b is less than 1 << biH and return error otherwise.

only do calculation when b is less than 1 << biH and return error otherwise.

Unfortunately this is not acceptable: this is a public function, we can't remove functionality from it.

Can you confirm experimentally (with a debugger or by adding an assert or a printf) that the failing test case causes an overflow at this point?

library patch:

diff --git a/library/bignum.c b/library/bignum.c
index d33f07cc4..b2b09049f 100644
--- a/library/bignum.c
+++ b/library/bignum.c
@@ -1479,10 +1479,18 @@ int mbedtls_mpi_mod_int( mbedtls_mpi_uint *r, const mbedtls_mpi *A, mbedtls_mpi_
     for( i = A->n, y = 0; i > 0; i-- )
     {
         x  = A->p[i - 1];
+        if ( y >> biH )
+            mbedtls_printf(
+                    "[bignum.c:%d] y << biH would overflow, y = %08x\n",
+                    __LINE__, y);
         y  = ( y << biH ) | ( x >> biH );
         z  = y / b;
         y -= z * b;
 
+        if ( y >> biH )
+            mbedtls_printf(
+                    "[bignum.c:%d] y << biH would overflow, y = %08x\n",
+                    __LINE__, y);
         x <<= biH;
         y  = ( y << biH ) | ( x >> biH );
         z  = y / b;

Test code:

#define MBEDTLS_ALLOW_PRIVATE_ACCESS
#include <mbedtls/bignum.h>

#define CF_CHECK_EQ(expr, res) if ( (expr) != (res) ) { goto end; }

char* A[] = {
    "562954248388609",
    "562954248388610",
    "2417870085973332058963968",
    "230772460340063000000100500000300000010",
    "562954248388609",
    "562954248388610",
    "2417870085973332058963968",
    "230772460340063000000100500000300000010",
};

mbedtls_mpi_sint B[] = {
    65537,
    65537,
    65537,
    1205652040,
    65535,
    65535,
    65535,
    65535,
};

#define TEST_NUM ( sizeof(A) / sizeof(A[0]) )

int main(void)
{
    int i;
    mbedtls_mpi a,b;
    mbedtls_mpi r;
    mbedtls_mpi_uint ret;
    mbedtls_mpi_uint p[1];

    for ( i = 0; i < TEST_NUM; i++ )
    {
        printf("\n===== test %d =====\n", i);
        printf("Calculate A mod B\n");
        printf("A = %s\n", A[i]);
        printf("B = %d\n", B[i]);


        mbedtls_mpi_init(&a);
        mbedtls_mpi_init(&b);
        mbedtls_mpi_init(&r);

        CF_CHECK_EQ(mbedtls_mpi_read_string(&a, 10, A[i]), 0);
        p[0] = ( B[i] < 0 ) ? -B[i] : B[i];
        b.s = ( B[i] < 0 ) ? -1 : 1;
        b.n = 1;
        b.p = p;

        CF_CHECK_EQ(mbedtls_mpi_mod_int(&ret, &a, B[i]), 0);
        CF_CHECK_EQ(mbedtls_mpi_mod_mpi(&r, &a, &b), 0);
        printf("mbedtls_mpi_mod_int: %u\n", ret);
        printf("mbedtls_mpi_mod_mpi: %u\n", r.p[0]);
    }
end:
    mbedtls_mpi_free(&a);
    return 0;
}

Output:

===== test 0 =====
Calculate A mod B
A = 562954248388609
B = 65537
[bignum.c:1485] y << biH would overflow, y = 00010000
mbedtls_mpi_mod_int: 1
mbedtls_mpi_mod_mpi: 0

===== test 1 =====
Calculate A mod B
A = 562954248388610
B = 65537
[bignum.c:1485] y << biH would overflow, y = 00010000
mbedtls_mpi_mod_int: 2
mbedtls_mpi_mod_mpi: 1

===== test 2 =====
Calculate A mod B
A = 2417870085973332058963968
B = 65537
[bignum.c:1485] y << biH would overflow, y = 00010000
mbedtls_mpi_mod_int: 0
mbedtls_mpi_mod_mpi: 65536

===== test 3 =====
Calculate A mod B
A = 230772460340063000000100500000300000010
B = 1205652040
[bignum.c:1485] y << biH would overflow, y = 1de3942f
[bignum.c:1493] y << biH would overflow, y = 0475d7be
[bignum.c:1485] y << biH would overflow, y = 00283a25
[bignum.c:1493] y << biH would overflow, y = 3a25f156
[bignum.c:1485] y << biH would overflow, y = 19c06031
[bignum.c:1493] y << biH would overflow, y = 1854b686
mbedtls_mpi_mod_int: 650989178
mbedtls_mpi_mod_mpi: 3644370

===== test 4 =====
Calculate A mod B
A = 562954248388609
B = 65535
mbedtls_mpi_mod_int: 4
mbedtls_mpi_mod_mpi: 4

===== test 5 =====
Calculate A mod B
A = 562954248388610
B = 65535
mbedtls_mpi_mod_int: 5
mbedtls_mpi_mod_mpi: 5

===== test 6 =====
Calculate A mod B
A = 2417870085973332058963968
B = 65535
mbedtls_mpi_mod_int: 3
mbedtls_mpi_mod_mpi: 3

===== test 7 =====
Calculate A mod B
A = 230772460340063000000100500000300000010
B = 65535
mbedtls_mpi_mod_int: 61410
mbedtls_mpi_mod_mpi: 61410

Since the algorithm is incorrect, we would expect there to be values for which it fails on 64-bit systems, and indeed mbedtls_mpi_mod_int(230772460340063000000100500000300000010, 5178236083361335880) gives 3644370 not 2099774298.

The following test cases include both 32-bit and 64-bit trigger values, and also use the mpi_mod_mpi tests to show the expected values are correct. On 32-bit systems (only) the third (second mpi_mod_int) case fails; on 64-bit systems the first case fails.

(However, there is a gotcha here: since the test interface gives int not long, on an LP64 system - rather than ILP64 - it's not possible to correctly express the 64-bit mbedtls_mpi_mod_int() test)

Test mbedtls_mpi_mod_int: 230772460340063000000100500000300000010 % 5178236083361335880 -> 3386266129388798810
depends_on:MBEDTLS_HAVE_INT64
mpi_mod_int:"AD9D28BF6C4E98FDF156BF0980CEE30A":5178236083361335880:3386266129388798810:0

Test mbedtls_mpi_mod_mpi: 230772460340063000000100500000300000010 % 5178236083361335880 -> 3386266129388798810
mpi_mod_mpi:"AD9D28BF6C4E98FDF156BF0980CEE30A":"47DCCA4847DCCA48":"2EFE6F1A7D28035A":0

Test mbedtls_mpi_mod_int: 230772460340063000000100500000300000010 % 1205652040 -> 3644370
mpi_mod_int:"AD9D28BF6C4E98FDF156BF0980CEE30A":1205652040:3644370:0

Test mbedtls_mpi_mod_mpi: 230772460340063000000100500000300000010 % 1205652040 -> 3644370
mpi_mod_mpi:"AD9D28BF6C4E98FDF156BF0980CEE30A":"47DCCA48":"379BD2":0

My fuzzer missed the 64 bit case because it was always using int32 instead of mbedtls_mpi_sint. Fixed it and now it finds it immediately.

guidovranken/cryptofuzz@42dcf77

I've created a couple of PRs to add these test cases (commented out) #6573 for development and the backport #6575

	/*
	* general case
	*/
	for( i = A->n, y = 0; i > 0; i-- )
	{
	x = A->p[i - 1];
	y = ( y << biH ) \| ( x >> biH );
	z = y / b;
	y -= z * b;

	x <<= biH;
	y = ( y << biH ) \| ( x >> biH );
	z = y / b;
	y -= z * b;
	}