baresip/re

aes_authenticate valgrind warning "uninitialised value(s)"

cspiel1 opened this issue · 15 comments

test 2 -- test_aes_gcm
==1079617== Conditional jump or move depends on uninitialised value(s)
==1079617==    at 0x4F8F234: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==1079617==    by 0x4F8F511: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==1079617==    by 0x4E900F4: EVP_DecryptFinal_ex (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==1079617==    by 0x48F0F12: aes_authenticate (aes.c:257)
==1079617==    by 0x11E303: test_aes_gcm (aes.c:342)
==1079617==    by 0x161D2B: test_unit (test.c:492)
==1079617==    by 0x1625CC: test_reg (test.c:710)
==1079617==    by 0x13D63D: main (main.c:233)
$ openssl version
OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)

Maybe an openssl issue?

fedora@fedora36 retest]$ valgrind --leak-check=full --track-origins=yes ./retest -v -r test_aes_gcm
==20916== Memcheck, a memory error detector
==20916== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==20916== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==20916== Command: ./retest -v -r test_aes_gcm
==20916== 
single testcase: test_aes_gcm
retest: libre version 2.9.0-dev (x86_64/Linux)
using async polling method 'epoll'
using datapath './data'
regular tests:       ==20916== Conditional jump or move depends on uninitialised value(s)
==20916==    at 0x4C4FE84: gcm_cipher_internal (ciphercommon_gcm.c:425)
==20916==    by 0x4C50121: ossl_gcm_stream_final (ciphercommon_gcm.c:327)
==20916==    by 0x4B56E9C: EVP_DecryptFinal_ex (evp_enc.c:917)
==20916==    by 0x48F3DB7: aes_authenticate (in /home/fedora/git/re/libre.so.10.9.0)
==20916==    by 0x40EF63: test_aes_gcm (aes.c:342)
==20916==    by 0x448CCE: test_unit (test.c:477)
==20916==    by 0x44948A: test_reg (test.c:711)
==20916==    by 0x429DE3: main (main.c:233)
==20916== 
OK
==20916== 
==20916== HEAP SUMMARY:
==20916==     in use at exit: 0 bytes in 0 blocks
==20916==   total heap usage: 7,118 allocs, 7,118 frees, 602,826 bytes allocated
==20916== 
==20916== All heap blocks were freed -- no leaks are possible
==20916== 
==20916== For lists of detected and suppressed errors, rerun with: -s
==20916== ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 0 from 0)

this could be a bug in re and/or openssl

@robert-scheck any advice for how to report this to Fedora ?

Could you let me understand how this is specific to Fedora? /usr/lib/x86_64-linux-gnu/libcrypto.so.3 looks like Debian, not Fedora (thus the issue seems to exist in both, Debian and Fedora?). Regular way to report at Fedora is Red Hat Bugzilla. I'm happy to take care of this, if I understood how it's specific to Fedora, because otherwise I might get pointed to OpenSSL and/or re upstream.

I am on Ubuntu 22.04 LTS

https://github.com/baresip/re/actions/runs/3213367854/jobs/5252982891

==5425== Conditional jump or move depends on uninitialised value(s)
==5425==    at 0x4F99234: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==5425==    by 0x4F99511: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==5425==    by 0x4E9A0F4: EVP_DecryptFinal_ex (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==5425==    by 0x490395F: aes_authenticate (in /home/runner/work/re/re/build/libre.so.10.9.0)
==5425==    by 0x11E323: test_aes_gcm (aes.c:342)
==5425==    by 0x16217D: test_unit (test.c:493)
==5425==    by 0x162A1E: test_reg (test.c:711)
==5425==    by 0x13DB1E: main (main.c:233)

The bug is present on both Ubuntu 22.04 and Fedora 36.

It might be related to OpenSSL version 3.0.x

We have been running the same test with valgrind and OpenSSL 1.1.1 for a long time with zero warnings.

sorry @robert-scheck it was not specific to Fedora :)

Maybe something needs to be initialized different in OpenSSL v3, but nothing that jumps out at me. If we are sure this is a OpenSSL upstream bug, we can try to suppress the error with a valgrind --suppressions=<filename> rule . And maybe open a issue.

The issue can be reproduced with Debian 11 and OpenSSL 3.0.5

alfredh@debian:~/tmp/retest$ valgrind --track-origins=yes ./retest -v -r test_aes_gcm
==283064== Memcheck, a memory error detector
==283064== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==283064== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
==283064== Command: ./retest -v -r test_aes_gcm
==283064== 
single testcase: test_aes_gcm
retest: libre version 2.9.0-dev (x86_64/Linux)
using async polling method 'epoll'
using datapath './data'
regular tests:       ==283064== Conditional jump or move depends on uninitialised value(s)
==283064==    at 0x5058AD5: gcm_cipher_internal (ciphercommon_gcm.c:425)
==283064==    by 0x5058AD5: ossl_gcm_stream_final (ciphercommon_gcm.c:327)
==283064==    by 0x4F1619E: EVP_DecryptFinal_ex (evp_enc.c:917)
==283064==    by 0x48E7142: aes_authenticate (in /home/alfredh/tmp/re/libre.so.10.9.0)
==283064==    by 0x11BF00: test_aes_gcm (aes.c:342)
==283064==    by 0x1587E9: test_unit (test.c:477)
==283064==    by 0x158FF3: test_reg (test.c:711)
==283064==    by 0x13827F: main (main.c:233)
==283064== 

OS: Debian 11
GCC: gcc (Debian 10.2.1-6) 10.2.1 20210110

When openssl is compiled with default options, the warning is present.

If openssl is compiled with debug mode "./config -d" then there are no warnings.

openssl code:

    } else {
        /* The tag must be set before actually decrypting data */
        if (!ctx->enc && ctx->taglen == UNINITIALISED_SIZET)
            goto err;
        if (!hw->cipherfinal(ctx, ctx->buf))
            goto err;
        ctx->iv_state = IV_STATE_FINISHED; /* Don't reuse the IV */
        goto finish;
    }

Looks like its fixed in openssl 3.0.7-dev.

@cspiel1 could you please try again with openssl from git HEAD or 3.0.7-dev ?

If the bug is in the OpenSSL code, we can close this Issue.

It is still there with OpenSSL 3.2.0-dev (Library: OpenSSL 3.2.0-dev ):

$ valgrind ./retest 
==821891== Memcheck, a memory error detector
==821891== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==821891== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==821891== Command: ./retest
==821891== 
retest: libre version 2.9.0 (x86_64/linux)
using async polling method 'epoll'
regular tests:       ==821891== Conditional jump or move depends on uninitialised value(s)
==821891==    at 0x51C00B5: ossl_gcm_stream_final (in /usr/local/lib64/libcrypto.so.3)
==821891==    by 0x4FBC864: EVP_DecryptFinal_ex (in /usr/local/lib64/libcrypto.so.3)
==821891==    by 0x48F21C8: aes_authenticate (aes.c:240)
==821891==    by 0x11E383: test_aes_gcm (aes.c:342)
==821891==    by 0x1629F6: test_unit (test.c:494)
==821891==    by 0x163297: test_reg (test.c:712)
==821891==    by 0x13E397: main (main.c:233)
==821891== 
OK

The issue is most likely in openssl and not re.

We should perhaps create a valgrind suppression file.

Btw, baresip selftest has no warnings

the warning has been reported to the OpenSSL project.

the warning could be in ASM code or it could be a false positive.

it is possible to change the retest aesgcm code slightly, to avoid the warning

Do you have time to make this PR for retest to avoid this?

the testcode in aesgcm can be fixed by splitting the encryption and decryption steps into two parts.

our code is correct, and the warning is either a false positive or bug in valgrind/openssl