samsk/log-malloc2

GSlice: Cannot allocate memory (with _posix_memalign)

ashok3t opened this issue · 3 comments

Applications that use cv::fastMalloc to call posix_memalign(&ptr, CV_MALLOC_ALIGN, size) fail during dl_init with:
MEMORY-ERROR: [20699]: GSlice: failed to allocate 1008 bytes (alignment: 1024): Cannot allocate memory

#include <opencv2/core/core.hpp>
int main(int argc, char* argv[]) {
// cv::fastMalloc(1024);
return 0;
}

Reference: https://github.com/opencv/opencv/blob/master/modules/core/src/alloc.cpp
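
To take OpenCV out of the picture, the allocator contract can be checked directly. The following is a minimal sketch, not part of the original report: `check_posix_memalign` is a hypothetical helper name, and it only assumes the POSIX `posix_memalign` declared in `<stdlib.h>`. It verifies the two things an aligned allocation has to deliver on success: the result pointer is written, and the address honors the requested alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free (POSIX)

// Hypothetical helper: returns true only if posix_memalign reports success,
// stores a non-null pointer, and that pointer is a multiple of `alignment`.
bool check_posix_memalign(std::size_t alignment, std::size_t size) {
    void* ptr = nullptr;  // sentinel: must be overwritten on success
    const int rc = posix_memalign(&ptr, alignment, size);
    if (rc != 0) {
        return false;     // allocator reported an error code
    }
    const bool written = (ptr != nullptr);
    const bool aligned =
        written && (reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0);
    free(ptr);            // free(nullptr) is a no-op
    return written && aligned;
}
```

Calling `check_posix_memalign(1024, 1008)` uses the size and alignment from the GSlice message above. Running a small program built around it with and without `LD_PRELOAD` would show whether the preloaded allocator keeps the contract.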

$ LD_PRELOAD=/usr/local/lib/liblog-malloc2.so lldb-8 -- ./WS/build/test/OcvMatcherTest
*** log-malloc trace-fd = 1022 ***
*** log-malloc trace-fd = 1022 ***
*** log-malloc trace-fd = 1022 ***

(lldb) r
...
MEMORY-ERROR: [20871]: GSlice: failed to allocate 1008 bytes (alignment: 1024): Cannot allocate memory

Process 20871 stopped

* thread #1, name = 'OcvMatcherTest', stop reason = signal SIGABRT
    frame #0: 0x00007fffc25ca428 libc.so.6`__GI_raise(sig=6) at raise.c:54
(lldb) bt
* thread #1, name = 'OcvMatcherTest', stop reason = signal SIGABRT
  * frame #0: 0x00007fffc25ca428 libc.so.6`__GI_raise(sig=6) at raise.c:54
    frame #1: 0x00007fffc25cc02a libc.so.6`__GI_abort at abort.c:89
    frame #2: 0x00007fffbf84a961 libglib-2.0.so.0`___lldb_unnamed_symbol277$$libglib-2.0.so.0 + 273
    frame #3: 0x00007fffbf84b406 libglib-2.0.so.0`___lldb_unnamed_symbol282$$libglib-2.0.so.0 + 518
    frame #4: 0x00007fffbf84bfba libglib-2.0.so.0`g_slice_alloc + 1594
    frame #5: 0x00007fffbf81d74e libglib-2.0.so.0`g_hash_table_new_full + 30
    frame #6: 0x00007fffbf83e94b libglib-2.0.so.0`___lldb_unnamed_symbol241$$libglib-2.0.so.0 + 75
    frame #7: 0x00007ffff7de76ca ld-2.23.so`call_init(l=, argc=1, argv=0x00007fffffffde78, env=0x00007fffffffde88) at dl-init.c:72
    frame #8: 0x00007ffff7de77db ld-2.23.so`_dl_init at dl-init.c:30
    frame #9: 0x00007ffff7de77c5 ld-2.23.so`_dl_init(main_map=0x00007ffff7ffe168, argc=1, argv=0x00007fffffffde78, env=0x00007fffffffde88) at dl-init.c:120
    frame #10: 0x00007ffff7dd7c6a ld-2.23.so`_dl_start_user + 50

The problem may be related to dl_init of libopencv_core.so.3.4 (I am on thin ice here). For instance, with G_SLICE=always-malloc set, my application (or the reproducer) recurses in cv::fastMalloc until it crashes with SIGSEGV. This happens both with OpenCV 3.4.5, gcc 5.4 and CUDA 8.0 on 16.04, and with OpenCV 3.4.5, gcc 7.3 without CUDA on 18.04.

(lldb) thread backtrace -c 11 -s 17080

* thread #1, name = 'MidasATAdjust', stop reason = signal SIGSEGV: invalid address (fault address: 0x7fffff7feff8)
    frame #17080: 0x00007ffff6effb8a libopencv_core.so.3.4`cv::fastMalloc(unsigned long) + 90
    frame #17081: 0x00007ffff6e6de1d libopencv_core.so.3.4`cv::String::allocate(unsigned long) + 29
    frame #17082: 0x00007ffff7021759 libopencv_core.so.3.4`cv::format(char const*, ...) + 521
    frame #17083: 0x00007ffff6effb8a libopencv_core.so.3.4`cv::fastMalloc(unsigned long) + 90
    frame #17084: 0x00007ffff6efcb12 libopencv_core.so.3.4`cvRegisterType + 434
    frame #17085: 0x00007ffff6e36774 libopencv_core.so.3.4`CvType::CvType(char const*, int (*)(void const*), void (*)(void**), void* (*)(CvFileStorage*, CvFileNode*), void (*)(CvFileStorage*, char const*, void const*, CvAttrList), void* (*)(void const*)) + 100
    frame #17086: 0x00007ffff6e367dd libopencv_core.so.3.4`_GLOBAL__sub_I_persistence_types.cpp + 61
    frame #17087: 0x00007ffff7de76ca ld-2.23.so`call_init(l=, argc=7, argv=0x00007fffffffdcb8, env=0x00007fffffffdcf8) at dl-init.c:72
    frame #17088: 0x00007ffff7de77db ld-2.23.so`_dl_init at dl-init.c:30
    frame #17089: 0x00007ffff7de77c5 ld-2.23.so`_dl_init(main_map=0x00007ffff7ffe168, argc=7, argv=0x00007fffffffdcb8, env=0x00007fffffffdcf8) at dl-init.c:120
    frame #17090: 0x00007ffff7dd7c6a ld-2.23.so`_dl_start_user + 50
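
The frames above show a cycle: cv::fastMalloc calls cv::format, which calls cv::String::allocate, which calls cv::fastMalloc again. One plausible reading is that the failure path of the allocator needs heap memory for its error message, so a backend that always fails never lets the recursion end. The toy model below only illustrates that pattern. `ToyAllocator` and its members are hypothetical names, none of this is OpenCV code, and the `max_depth` cap exists only so the sketch terminates instead of overflowing the stack.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of an allocator whose error path allocates again.
struct ToyAllocator {
    bool backend_fails;  // simulates an aligned-allocation call that never succeeds
    int  max_depth;      // artificial stop; the real crash has no such cap
    int  depth;          // number of alloc() calls made so far
    char storage;        // dummy object handed out on success

    ToyAllocator(bool fails, int cap)
        : backend_fails(fails), max_depth(cap), depth(0), storage(0) {}

    void* alloc(std::size_t size) {
        ++depth;
        if (!backend_fails) {
            return &storage;        // success: no error path involved
        }
        if (depth >= max_depth) {
            return nullptr;         // cut the recursion off for the demo
        }
        report_failure(size);       // the error path needs memory itself
        return nullptr;
    }

    // Stand-in for formatting an error message into a heap-allocated string.
    void report_failure(std::size_t /*failed_size*/) {
        alloc(64);
    }
};
```

With a working backend a single call is made. With a failing backend the call count grows until the cap, which corresponds to the thousands of repeated frames in the backtrace.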

Possibly relevant:

$ valgrind --track-origins=yes ./WS/build/test/OcvMatcherTest
==22169== Memcheck, a memory error detector
==22169== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==22169== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==22169== Command: ./WS/build/test/OcvMatcherTest
==22169==
==22169== Warning: set address range perms: large range [0x1ab57000, 0x34a30000) (defined)
==22169== Conditional jump or move depends on uninitialised value(s)
==22169== at 0x39F50177: __strstr_sse2_unaligned (strstr-sse2-unaligned.S:99)
==22169== by 0x571379EE: selinuxfs_exists (in /lib/x86_64-linux-gnu/libselinux.so.1)
==22169== by 0x57132CDB: ??? (in /lib/x86_64-linux-gnu/libselinux.so.1)
==22169== by 0x40106C9: call_init.part.0 (dl-init.c:72)
==22169== by 0x40107DA: call_init (dl-init.c:30)
==22169== by 0x40107DA: _dl_init (dl-init.c:120)
==22169== by 0x4000C69: ??? (in /lib/x86_64-linux-gnu/ld-2.23.so)
==22169== Uninitialised value was created
==22169== at 0x39FA2E19: brk (brk.c:31)
==22169== by 0x39FA2EF8: sbrk (sbrk.c:58)
==22169== by 0x39F2D8C8: __default_morecore (morecore.c:47)
==22169== by 0x39F27634: sysmalloc (malloc.c:2484)
==22169== by 0x39F28742: _int_malloc (malloc.c:3827)
==22169== by 0x39F2B897: __libc_malloc (malloc.c:2913)
==22169== by 0x39F2B897: malloc_hook_ini (hooks.c:32)
==22169== by 0x39ED42CF: set_binding_values (bindtextdom.c:202)
==22169== by 0x39ED42CF: bindtextdomain (bindtextdom.c:320)
==22169== by 0x5C78DC56: ??? (in /lib/x86_64-linux-gnu/libgpg-error.so.0.17.0)
==22169== by 0x40106C9: call_init.part.0 (dl-init.c:72)
==22169== by 0x40107DA: call_init (dl-init.c:30)
==22169== by 0x40107DA: _dl_init (dl-init.c:120)
==22169== by 0x4000C69: ??? (in /lib/x86_64-linux-gnu/ld-2.23.so)
==22169==
==22169==
==22169== HEAP SUMMARY:
==22169== in use at exit: 0 bytes in 0 blocks
==22169== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==22169==
==22169== All heap blocks were freed -- no leaks are possible
==22169==
==22169== For counts of detected and suppressed errors, rerun with: -v
==22169== ERROR SUMMARY: 5 errors from 1 contexts (suppressed: 0 from 0)

samsk commented

This is related to how log-malloc allocates memory when posix_memalign is used. Internally it uses a plain malloc, because I didn't want to waste too much memory on the head record, and therefore the memory is not actually aligned the way the application expects.

Unfortunately, I don't have time right now to fix this immediately, but I'm adding it to my todo list...

The posix_memalign implementation here has a bug: it never writes the allocated memory address to the result pointer.
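
For illustration, here is one way a tracing wrapper could keep a head record and still return aligned memory while always storing the result. This is a sketch under stated assumptions, not log-malloc's actual code: `SketchHeader`, `sketch_posix_memalign` and `sketch_free` are hypothetical names, and the header layout is invented. It over-allocates by the alignment plus the header size, rounds the user address up, and places the header directly in front of it.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical bookkeeping record; a real tracer would store more.
struct SketchHeader {
    void*       raw;   // pointer returned by malloc, needed to free the block
    std::size_t size;  // size requested by the caller
};

int sketch_posix_memalign(void** memptr, std::size_t alignment, std::size_t size) {
    // POSIX requires a power-of-two alignment that is a multiple of sizeof(void*).
    if (memptr == nullptr || alignment < sizeof(void*) ||
        (alignment & (alignment - 1)) != 0) {
        return EINVAL;
    }
    // Guard against overflow when adding the padding.
    if (size > SIZE_MAX - alignment - sizeof(SketchHeader)) {
        return ENOMEM;
    }
    void* raw = std::malloc(size + alignment + sizeof(SketchHeader));
    if (raw == nullptr) {
        return ENOMEM;
    }
    // Leave room for the header, then round up to the next aligned address.
    const std::uintptr_t start =
        reinterpret_cast<std::uintptr_t>(raw) + sizeof(SketchHeader);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t user = (start + mask) & ~mask;

    SketchHeader* header = reinterpret_cast<SketchHeader*>(user) - 1;
    header->raw  = raw;
    header->size = size;

    *memptr = reinterpret_cast<void*>(user);  // the step reported as missing
    return 0;
}

void sketch_free(void* ptr) {
    if (ptr == nullptr) {
        return;
    }
    SketchHeader* header = static_cast<SketchHeader*>(ptr) - 1;
    std::free(header->raw);
}
```

The cost is up to `alignment + sizeof(SketchHeader)` extra bytes per block, which is the memory overhead mentioned in the comment above. Memory obtained this way has to be released through the matching `sketch_free`, since the user pointer is not the one malloc returned.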