stricaud/faup

Valgrind error with some urls using the lib (tld-tree.c:295)

Closed this issue · 5 comments

I process data from DNS records and found an issue when processing some URLs. Valgrind reports an invalid read of size 1, several bytes past the end of the input URL.

Program to reproduce the issue:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <faup/faup.h>
#include <faup/decode.h>

int main(){
    const char* url = "sub.sub.sub.domain.com/";

    /* Copy the URL into a heap buffer so Valgrind can track its bounds. */
    char* domain = malloc(strlen(url)+1);
    memset(domain, 0, strlen(url)+1);
    strcpy(domain, url);

    faup_options_t* opts = faup_options_new();
    faup_handler_t* fh = faup_init(opts);

    /* The invalid read happens inside this call. */
    faup_decode(fh, domain, strlen(domain));

    free(domain);
    faup_terminate(fh);
    faup_options_free(opts);
    return 0;
}

Compile with:

gcc main.c -lfaupl
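
and run it under Valgrind (the log below assumes the binary was built as faup_test):

valgrind ./faup_test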

The Valgrind output:

==116159== Memcheck, a memory error detector
==116159== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==116159== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==116159== Command: ./faup_test
==116159== 
==116159== Invalid read of size 1
==116159==    at 0x488B969: faup_tld_tree_tld_exists (tld-tree.c:295)
==116159==    by 0x488BE64: faup_tld_tree_extract (tld-tree.c:406)
==116159==    by 0x48870D2: faup_decode (decode.c:228)
==116159==    by 0x108B7A: main (in /tmp/faup_test)
==116159==  Address 0x4aad05c is 4 bytes after a block of size 24 alloc'd
==116159==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==116159==    by 0x108AD7: main (in /tmp/faup_test)
==116159== 
==116159== 
==116159== HEAP SUMMARY:
==116159==     in use at exit: 1,812 bytes in 5 blocks
==116159==   total heap usage: 61,997 allocs, 61,992 frees, 1,745,213 bytes allocated
==116159== 
==116159== LEAK SUMMARY:
==116159==    definitely lost: 0 bytes in 0 blocks
==116159==    indirectly lost: 0 bytes in 0 blocks
==116159==      possibly lost: 0 bytes in 0 blocks
==116159==    still reachable: 1,812 bytes in 5 blocks
==116159==         suppressed: 0 bytes in 0 blocks
==116159== Reachable blocks (those to which a pointer was found) are not shown.
==116159== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==116159== 
==116159== For lists of detected and suppressed errors, rerun with: -s
==116159== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

It might be linked to issue #106.
It seems to happen when there are several subdomains and a path after the TLD (the issue disappears when the trailing / is removed).
I cannot reproduce the issue with the command-line tool.
I can reproduce the issue on Linux 5.15, Ubuntu 20.04.1.

Thank you

Thanks for the report, I can reproduce this. I can see, however, that there is no leak:

==86171== All heap blocks were freed -- no leaks are possible

Yet, I do not want to see this, so I will have a look. Thanks!

I took another look at this issue and managed to narrow it down. It seems to me that the error originates on line 402 of tld-tree.c: to verify whether the TLD is an exception, the TLD is passed on line 406 with a length that should not be this big.
With printf calls added to the source, I can see these values:

tld_extract: counter: 11, org_str_len: 23, host.pos: 0, host.size: 22, last_len_just_for_host 10

The function faup_tld_tree_tld_exists then tries to access tld + tld_len - 1, which is outside the original buffer.
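
To make the failure mode concrete, here is a minimal self-contained sketch of the same access pattern. This is not faup's code; the function name and the +6 offset are illustrative, chosen so the out-of-bounds index matches the "4 bytes after a block of size 24" address in the Valgrind output above:

#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for faup_tld_tree_tld_exists: it inspects the
   last byte of the candidate TLD, so an oversized tld_len reads past
   the end of the caller's allocation. */
static char last_byte(const char *tld, size_t tld_len)
{
    return tld[tld_len - 1];
}

int main(void)
{
    const char *url = "sub.sub.sub.domain.com/";
    char *buf = strdup(url); /* 24-byte heap block, as in the report */

    /* Deliberately oversized length: index 28 is 4 bytes past the end
       of the 24-byte block, matching "Address ... is 4 bytes after a
       block of size 24" above. */
    last_byte(buf, strlen(buf) + 6);

    free(buf);
    return 0;
}

Under Valgrind, this sketch triggers the same "Invalid read of size 1" report.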

I can't go much further, as I really don't understand the calculation on line 402.

After some more digging, I found what seems to trigger the issue:
It happens when the subdomains are at least twice as long as the TLD. I can't understand why, but the issue seems to come from "last_len" (or counter), which was changed a few years ago (see here).

I tried to revert it: last_len = counter; => last_len = strlen(last); and it seems to have fixed the issue for us. Don't hesitate to ask if you want us to post the fix as a PR.
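
For clarity, the revert in diff form (line 402 of tld-tree.c; surrounding code omitted):

-    last_len = counter;
+    last_len = strlen(last);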

Sorry for the delay. Please send this as a PR so we can run the tests and merge, or track from there if we missed something. Thanks!

Fixed on master; I'm closing my ticket.