pisa-engine/pisa

Address sanitizer error in recursive graph bisection test

elshize opened this issue · 1 comments

Describe the bug

With address sanitizer on and on Clang 15, test_recursive_graph_bisection fails.

Steps To Reproduce

  1. Compile with Clang 15 (and libc++) and -DUSE_SANITIZERS=ON
  2. Run test_recursive_graph_bisection test.

Error message

=================================================================
==2937914==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fcec4c1c850 at pc 0x0000007c257a bp 0x7ffe832cf910 sp 0x7ffe832cf908
READ of size 1 at 0x7fcec4c1c850 thread T0
    #0 0x7c2579 in auto void pisa::recursive_graph_bisection<std::__1::__wrap_iter<unsigned int*>>(std::__1::vector<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>, std::__1::allocator<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>>, pisa::progress&)::'lambda'(std::__1::__wrap_iter<unsigned int*>&)::operator()<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>(std::__1::__wrap_iter<unsigned int*>&) const::'lambda'()::operator()() const /home/elshize/dev/pisa/include/pisa/recursive_graph_bisection.hpp:361:21
    #1 0x7c211c in tbb::internal::function_task<auto void pisa::recursive_graph_bisection<std::__1::__wrap_iter<unsigned int*>>(std::__1::vector<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>, std::__1::allocator<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>>, pisa::progress&)::'lambda'(std::__1::__wrap_iter<unsigned int*>&)::operator()<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>(std::__1::__wrap_iter<unsigned int*>&) const::'lambda'()>::execute() /home/elshize/dev/pisa/external/tbb/include/tbb/task.h:1059:13
    #2 0x7fcec6f63ce0 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::process_bypass_loop(tbb::internal::context_guard_helper<false>&, tbb::task*, long) /home/elshize/dev/pisa/external/tbb/./src/tbb/custom_scheduler.h:474:25
    #3 0x7fcec6f63255 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) /home/elshize/dev/pisa/external/tbb/./src/tbb/custom_scheduler.h:636:19
    #4 0x7bb6bd in tbb::task::wait_for_all() /home/elshize/dev/pisa/external/tbb/include/tbb/task.h:820:25
    #5 0x7bb6bd in tbb::internal::task_group_base::wait() /home/elshize/dev/pisa/external/tbb/include/tbb/task_group.h:168:22
    #6 0x7b9306 in void pisa::recursive_graph_bisection<std::__1::__wrap_iter<unsigned int*>>(std::__1::vector<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>, std::__1::allocator<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>>, pisa::progress&) /home/elshize/dev/pisa/include/pisa/recursive_graph_bisection.hpp:368:21
    #7 0x747c0c in pisa::detail::run_with_config(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, pisa::document_range<std::__1::__wrap_iter<unsigned int*>> const&) /home/elshize/dev/pisa/include/pisa/reorder_docids.hpp:62:9
    #8 0x65cefa in pisa::recursive_graph_bisection(pisa::RecursiveGraphBisectionOptions const&) /home/elshize/dev/pisa/include/pisa/reorder_docids.hpp:98:13
    #9 0x667ff4 in ____C_A_T_C_H____T_E_S_T____0() /home/elshize/dev/pisa/test/test_recursive_graph_bisection.cpp:120:24
    #10 0x618fa7 in Catch::TestCase::invoke() const /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:14160:15
    #11 0x618fa7 in Catch::RunContext::invokeActiveTestCase() /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:13020:27
    #12 0x6169ec in Catch::RunContext::runCurrentTest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:12993:17
    #13 0x6147be in Catch::RunContext::runTest(Catch::TestCase const&) /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:12754:13
    #14 0x620805 in Catch::(anonymous namespace)::TestGroup::execute() /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:13347:45
    #15 0x620805 in Catch::Session::runInternal() /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:13553:39
    #16 0x61e137 in Catch::Session::run() /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:13509:24
    #17 0x65c4f9 in int Catch::Session::run<char>(int, char const* const*) /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:13231:30
    #18 0x65c4f9 in main /home/elshize/dev/pisa/external/Catch2/single_include/catch2/catch.hpp:17526:29
    #19 0x7fcec6b7e50f in __libc_start_call_main (/usr/lib64/libc.so.6+0x2750f) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
    #20 0x7fcec6b7e5c8 in __libc_start_main@GLIBC_2.2.5 (/usr/lib64/libc.so.6+0x275c8) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
    #21 0x4a5474 in _start (/home/elshize/dev/pisa/build/test/test_recursive_graph_bisection+0x4a5474) (BuildId: 1eb26c90b12f1c027ad266dd5308cae258c09337)

Address 0x7fcec4c1c850 is located in stack of thread T0 at offset 80 in frame
    #0 0x7b8c1f in void pisa::recursive_graph_bisection<std::__1::__wrap_iter<unsigned int*>>(std::__1::vector<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>, std::__1::allocator<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>>, pisa::progress&) /home/elshize/dev/pisa/include/pisa/recursive_graph_bisection.hpp:334

  This frame has 6 object(s):
    [32, 48) 'ref.tmp.i.i.i.i'
    [64, 96) 'agg.tmp2652' <== Memory access at offset 80 is inside this variable
    [128, 136) 'ref.tmp.i.i'
    [160, 161) '__comp.i.i'
    [176, 464) 'thread_local_data' (line 335)
    [528, 792) 'level_group' (line 343)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope /home/elshize/dev/pisa/include/pisa/recursive_graph_bisection.hpp:361:21 in auto void pisa::recursive_graph_bisection<std::__1::__wrap_iter<unsigned int*>>(std::__1::vector<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>, std::__1::allocator<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>>, pisa::progress&)::'lambda'(std::__1::__wrap_iter<unsigned int*>&)::operator()<pisa::computation_node<std::__1::__wrap_iter<unsigned int*>>>(std::__1::__wrap_iter<unsigned int*>&) const::'lambda'()::operator()() const
Shadow bytes around the buggy address:
  0x0ffa5897b8b0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x0ffa5897b8c0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x0ffa5897b8d0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x0ffa5897b8e0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x0ffa5897b8f0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
=>0x0ffa5897b900: f1 f1 f1 f1 f8 f8 f2 f2 f8 f8[f8]f8 f2 f2 f2 f2
  0x0ffa5897b910: f8 f2 f2 f2 f8 f2 00 00 00 00 00 00 00 00 00 00
  0x0ffa5897b920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffa5897b930: 00 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 f2 f2
  0x0ffa5897b940: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffa5897b950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==2937914==ABORTING

Environment info

Operating System: Fedora 37
Compiler name: Clang
Compiler version: 15

This seems to be specific to the Network-BP version, and only happens when run in multiple threads. Which leads me to believe that the input has some intersecting ranges, but have not verified this. NVM, found the bug, will submit a fix tomorrow.