seppo0010/rlite

Heap buffer overflow while reading (possibly) corrupted database file.

mannol opened this issue · 8 comments

In short: my app crashed once in an unrelated place. I don't know what happened to the database file, but ever since then it crashes while reading a certain key.

Steps to reproduce:

  1. Use the attached file (after uncompressing it): dynamic.db.zip

  2. Use the following code:

// ... open dynamic.db
rliteCommand(db, "ZREVRANGEBYLEX notifs + -");

The ASAN output:

=================================================================
==12178==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61900000098f at pc 0x7ff602937935 bp 0x7fff315f7d40 sp 0x7fff315f74e8
READ of size 20 at 0x61900000098f thread T0
    #0 0x7ff602937934 in __asan_memcpy (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x8c934)
    #1 0x60fbbf in rl_btree_node_deserialize_hash_sha1_key ../src/page_btree.c:995
    #2 0x5e3a89 in rl_read ../src/rlite_internal.c:658
    #3 0x60bd00 in rl_btree_find_score ../src/page_btree.c:187
    #4 0x5c8082 in rl_key_get_hash_ignore_expire ../src/page_key.c:92
    #5 0x5c81dd in rl_key_get_ignore_expire ../src/page_key.c:126
    #6 0x5c84a2 in rl_key_get ../src/page_key.c:193
    #7 0x5dcc89 in rl_zset_get_objects ../src/type_zset.c:114
    #8 0x5de32d in rl_zrevrangebylex ../src/type_zset.c:514
    #9 0x5b6e47 in genericZrangebylexCommand ../src/hirlite.c:1551
    #10 0x5b6ee2 in zrevrangebylexCommand ../src/hirlite.c:1564
    #11 0x5b4812 in rliteAppendCommandClient ../src/hirlite.c:1086
    #12 0x5b49cb in rlitevAppendCommand ../src/hirlite.c:1119
    #13 0x5b4b9e in rlitevCommand ../src/hirlite.c:1150

0x61900000098f is located 15 bytes to the right of 1024-byte region [0x619000000580,0x619000000980)
allocated by thread T0 here:
    #0 0x7ff602943602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x5e381b in rl_read ../src/rlite_internal.c:617
    #2 0x60bd00 in rl_btree_find_score ../src/page_btree.c:187
    #3 0x5c8082 in rl_key_get_hash_ignore_expire ../src/page_key.c:92
    #4 0x5c81dd in rl_key_get_ignore_expire ../src/page_key.c:126
    #5 0x5c84a2 in rl_key_get ../src/page_key.c:193
    #6 0x5dcc89 in rl_zset_get_objects ../src/type_zset.c:114
    #7 0x5de32d in rl_zrevrangebylex ../src/type_zset.c:514
    #8 0x5b6e47 in genericZrangebylexCommand ../src/hirlite.c:1551
    #9 0x5b6ee2 in zrevrangebylexCommand ../src/hirlite.c:1564
    #10 0x5b4812 in rliteAppendCommandClient ../src/hirlite.c:1086
    #11 0x5b49cb in rlitevAppendCommand ../src/hirlite.c:1119
    #12 0x5b4b9e in rlitevCommand ../src/hirlite.c:1150

SUMMARY: AddressSanitizer: heap-buffer-overflow ??:0 __asan_memcpy
Shadow bytes around the buggy address:
  0x0c327fff80e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff80f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff8100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff8110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff8120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c327fff8130: fa[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8150: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fff8160: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fff8170: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fff8180: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe

I thought about adding a per-page checksum to fail gracefully when the file is corrupted, but I never implemented it. In this case, the number of elements to read is stored in the file, and the code assumes it is correct, so if the file is corrupted, the read fails as shown above.
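To illustrate the failure mode described here, the following is a minimal sketch, not rlite's actual code or on-disk format. It assumes a hypothetical page layout in which a 4-byte big-endian element count is followed by fixed 20-byte keys (the size of a SHA-1 digest, matching the 20-byte read in the ASAN report), and shows a reader that checks the stored count against the page size before copying anything. The names `parse_keys`, `KEY_SIZE`, `HEADER_SIZE`, and `PARSE_CORRUPT` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE 20   /* hypothetical: one SHA-1 digest per key */
#define HEADER_SIZE 4 /* hypothetical: big-endian element count */

enum { PARSE_OK = 0, PARSE_CORRUPT = -1 };

/*
 * Copies the keys stored in page into out and stores their number in *count.
 * Returns PARSE_CORRUPT instead of reading past the page when the stored
 * count cannot fit in page_size bytes or in the output buffer.
 */
int parse_keys(const unsigned char *page, size_t page_size,
               unsigned char (*out)[KEY_SIZE], size_t max_keys,
               size_t *count)
{
    if (page_size < HEADER_SIZE) {
        return PARSE_CORRUPT;
    }
    uint32_t n = ((uint32_t)page[0] << 24) | ((uint32_t)page[1] << 16) |
                 ((uint32_t)page[2] << 8) | (uint32_t)page[3];

    /* Divide rather than multiply so a huge n cannot overflow. */
    size_t capacity = (page_size - HEADER_SIZE) / KEY_SIZE;
    if (n > capacity || n > max_keys) {
        return PARSE_CORRUPT;
    }
    for (uint32_t i = 0; i < n; i++) {
        memcpy(out[i], page + HEADER_SIZE + (size_t)i * KEY_SIZE, KEY_SIZE);
    }
    *count = n;
    return PARSE_OK;
}
```

A check like this only guards one field. It cannot tell whether an in-range count or the key bytes themselves are garbage.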

Without a checksum, it is too cumbersome to validate each value. Adding a checksum is possible, but I don't think I'll do that in the short term.
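As a rough idea of what a per-page checksum could look like, here is a sketch under stated assumptions. It is not something rlite implements. It assumes 1024-byte pages (the size of the heap region in the ASAN report) with the last 4 bytes reserved for a checksum of the rest, and it uses 32-bit FNV-1a only because it fits in a few lines; a real implementation would more likely use CRC32 or something stronger. `page_seal` and `page_verify` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 1024               /* assumed page size */
#define PAYLOAD_SIZE (PAGE_SIZE - 4) /* hypothetical: 4-byte checksum trailer */

/* 32-bit FNV-1a hash over len bytes. */
uint32_t fnv1a32(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Stores the payload checksum, big-endian, in the page trailer. */
void page_seal(unsigned char *page)
{
    uint32_t h = fnv1a32(page, PAYLOAD_SIZE);
    page[PAYLOAD_SIZE + 0] = (unsigned char)(h >> 24);
    page[PAYLOAD_SIZE + 1] = (unsigned char)(h >> 16);
    page[PAYLOAD_SIZE + 2] = (unsigned char)(h >> 8);
    page[PAYLOAD_SIZE + 3] = (unsigned char)h;
}

/* Returns 1 if the trailer matches the payload, 0 if the page is damaged. */
int page_verify(const unsigned char *page)
{
    uint32_t stored = ((uint32_t)page[PAYLOAD_SIZE + 0] << 24) |
                      ((uint32_t)page[PAYLOAD_SIZE + 1] << 16) |
                      ((uint32_t)page[PAYLOAD_SIZE + 2] << 8) |
                      (uint32_t)page[PAYLOAD_SIZE + 3];
    return stored == fnv1a32(page, PAYLOAD_SIZE);
}
```

With a scheme like this, a reader would call `page_verify` right after loading a page and return an error before deserializing anything from it.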

So how would I know if the file is corrupt, and is it possible to handle that case?

I'm sorry I cannot provide a good answer. The attached file looks corrupt, as I cannot find any SHA-1 hash in it.

Is it possible to deal with this heap overflow so it at least doesn't crash?

EDIT:
I just checked: running any command causes the same issue. So if it failed gracefully instead of crashing, I could run a basic command to find out whether the database is still valid. For example:

rliteCommand(db, "SET x y"); // would return NULL 

The main key tree looks broken, so any command that uses a key will fail this way. I cannot figure out how to fail gracefully in this situation without checksums.
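One possible workaround on the caller's side, offered only as a sketch and not as anything rlite provides, is to run the basic test command in a child process so that a crash there does not take the main process down. The example below is POSIX-only and every name in it (`run_isolated`, `probe_healthy`, `probe_crashing`, the `PROBE_*` codes) is hypothetical. In a real application the probe callback would open the database and issue a command such as "SET x y". Note that an out-of-bounds read is not guaranteed to crash, so a passing probe does not prove the file is intact.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { PROBE_OK = 0, PROBE_FAILED = 1, PROBE_CRASHED = 2, PROBE_ERROR = -1 };

/*
 * Runs probe(arg) in a forked child; the probe returns 0 on success.
 * A child killed by a signal is reported as PROBE_CRASHED, and a non-zero
 * exit (including a sanitizer report) as PROBE_FAILED.
 */
int run_isolated(int (*probe)(void *), void *arg)
{
    pid_t pid = fork();
    if (pid < 0) {
        return PROBE_ERROR;
    }
    if (pid == 0) {
        /* Child: _exit avoids running atexit handlers set up by the parent. */
        _exit(probe(arg) == 0 ? 0 : 1);
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return PROBE_ERROR;
        }
    }
    if (WIFSIGNALED(status)) {
        return PROBE_CRASHED;
    }
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        return PROBE_OK;
    }
    return PROBE_FAILED;
}

/* Stand-ins for a real probe that would open the database and run a command. */
int probe_healthy(void *arg) { (void)arg; return 0; }
int probe_crashing(void *arg) { (void)arg; abort(); }
```

A caller would treat anything other than `PROBE_OK` as "do not open this file in the main process".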

I'm sorry, I'm not very familiar with rlite internals. Is there a way to know if the main key tree is broken?

EDIT:
Basically, it's a real-life scenario where database files might get corrupted. It's a deal-breaker if the whole process crashes in this case. Are you saying there is no way to detect this at all?

There is no way to know if the file is corrupted, no.

A process crash should not corrupt the file: before any change is written to the main file, it is first written to a separate WAL file. If the process crashes before fully writing the WAL file, the transaction is aborted; if the process crashes after writing the WAL file, the transaction is recovered from the WAL file.
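The write-ahead scheme described above can be sketched roughly as follows. This is a simplified illustration, not rlite's WAL format: it assumes one 1024-byte page per transaction, a hypothetical record layout with a trailing commit marker, and POSIX file calls. All names (`wal_write`, `wal_recover`, `WAL_COMMIT`) are invented for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 1024
#define WAL_COMMIT 0x57414C31u /* hypothetical "record complete" marker */

/* Hypothetical WAL record: one page image plus a trailing commit marker. */
struct wal_record {
    uint32_t page_no;
    unsigned char data[PAGE_SIZE];
    uint32_t commit; /* last field; a write cut short leaves the record too short */
};

/* Writes exactly len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Step 1: make the change durable in the WAL before touching the main file. */
int wal_write(const char *wal_path, uint32_t page_no, const unsigned char *data)
{
    struct wal_record rec;
    memset(&rec, 0, sizeof rec);
    rec.page_no = page_no;
    memcpy(rec.data, data, PAGE_SIZE);
    rec.commit = WAL_COMMIT;

    int fd = open(wal_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    int ok = write_all(fd, &rec, sizeof rec) == 0 && fsync(fd) == 0;
    close(fd);
    return ok ? 0 : -1;
}

/* Copies a complete record's page image into the main file. */
static int apply_record(const char *db_path, const struct wal_record *rec)
{
    int fd = open(db_path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) return -1;
    off_t off = (off_t)rec->page_no * PAGE_SIZE;
    int ok = lseek(fd, off, SEEK_SET) == off &&
             write_all(fd, rec->data, PAGE_SIZE) == 0 && fsync(fd) == 0;
    close(fd);
    return ok ? 0 : -1;
}

/*
 * Step 2, also run at startup: apply a complete WAL to the main file, or
 * discard an incomplete one (the transaction is aborted). Returns 0 on
 * success, including when there is no WAL file at all.
 */
int wal_recover(const char *wal_path, const char *db_path)
{
    struct wal_record rec;
    int fd = open(wal_path, O_RDONLY);
    if (fd < 0) return errno == ENOENT ? 0 : -1;
    ssize_t n = read(fd, &rec, sizeof rec);
    close(fd);
    if (n == (ssize_t)sizeof rec && rec.commit == WAL_COMMIT) {
        if (apply_record(db_path, &rec) != 0) return -1; /* keep the WAL for a retry */
    }
    return unlink(wal_path);
}
```

In this sketch a normal commit is `wal_write` followed by `wal_recover`, and calling `wal_recover` again at startup covers both crash cases. It protects against an interrupted write, not against bytes that are damaged afterwards.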

I understand this is a deal-breaker.

Oh, okay. So, basically, a process crash by itself should not leave the file corrupted? In that case it's fine. Thank you for your time.