vmware/splinterdb

Couple of incorrect usages of `platform_heap_id hid` in BTree unit-tests: Could potentially lead to ASAN failures.

gapisback opened this issue · 2 comments

Describe the bug

During some prototype development of other features that exercise platform_heap_id, discovered following two instances of code snippets that are potentially flawed.

  1. unit/btree_test.c: In leaf_split_tests(), use of scratch arg as hid parameter to btree_leaf_incorporate_tuple() is potentially flawed. Way deep down in writable buffer code, we mistakenly think that this 'scratch' space handle is the heap-handle. And code flows into attempts to free memory from this handle. Currently, nothing is majorly wrong, but instrumentation to track allocs / frees are tripping up messages, indicating that this call-site is the source of the aberrant usage.
416       bool success = btree_leaf_incorporate_tuple(
417          cfg, scratch, hdr, key, bigger_msg, &spec, &generation);
418       if (success) {
  1. unit/btree_stress_test.c: In this test, the insert_thread_params{} struct has two instances of heap_id variables:
 29 typedef struct insert_thread_params {
 30    cache           *cc;
 31    btree_config    *cfg;
 32    platform_heap_id heap_id;
[...]
 38    platform_heap_id hid;
 39 } insert_thread_params;

This causes code to be confused as to which one to use. E.g., here in insert_tests() function we mistakenly use both handles; one to alloc and one to free.

312    uint8 *keybuf      = TYPED_MANUAL_MALLOC(hid, keybuf, keybuf_size);
313    uint8 *msgbuf      = TYPED_MANUAL_MALLOC(hid, msgbuf, msgbuf_size);
314
[...]
330    platform_free(heap_id, keybuf);
331    platform_free(heap_id, msgbuf);

Currently it's not a real issue as both heap_ids are NULL, so we pretty much end up using malloc and free.

In some future prototype work, this mixed-up usages of platform_heap_id is causing ASAN runs to flag memory allocation errors.

Reproduction steps

  1. Nothing can be seen in /main with ASAN runs. These are latent issues which will surface if we ever change heap-memory handling.

...

Expected behavior

  1. Potential fix is to use hid rather than scratch when invoking said function.
  2. Potential fix is to purge heap_id variable from insert_thread_params{} and consistently use only hid variable.

Additional context

No response

@rtjohnso - I logged this issue; fyi. #1 has also been posted on internal slack thread, here.
The 2nd one, I stumbled upon it while running ASAN-build tests in my dev code to implement shared memory support.

I have patched fixes for these in my dev-branch . Would you have time to fix these in /main? If yes, I'll let you take a crack at it. If not, I'll get to this after I get some control on my main task of insert perf benchmarking. (These are minor irritant blockers for that dev-work, but I've patched-fixed them in my dev-branch to keep my progress moving along.)

Fixed by this commit to /main:

a49f4a7 gapisback 32 hours ago Tue, 08-Nov-2022, 06:28:59 AM (Authored: Tue, 08-Nov-2022, 06:28:59 AM)
 (#463) Fix incorrect usages of heap-ID in BTree unit tests. (#472)