cmix segfaults (64-bit system) and throws std::bad_alloc (32-bit system)
Sfinx opened this issue · 9 comments
Hello,
On a stock 32-bit Slackware with the latest git, cmix always throws std::bad_alloc at predictor.cpp:175 when called with -c, even with non-existent file names (tried -g, -O3 and -Ofast -march=native).
On a stock 64-bit Ubuntu with the latest git, I get SIGSEGV with "-Ofast -march=native", but it works perfectly with "-O3".
Thanks for the report. The 32-bit failure is expected: cmix tries to allocate more memory than supported by 32-bit machines.
The SIGSEGV for ubuntu 64-bit is not expected. Latest git works on my 64-bit ubuntu machine with "-Ofast -march=native". Does the failure happen consistently, for all files (even tiny files)? It will be hard for me to debug without more information since I can't reproduce on my computer. In case you are familiar with gdb, it might be helpful if you make a debug build ("make clean" + "make debug") and run within gdb to see a stack trace.
The test file that segfaults is attached. A stack trace from a binary compiled with -Ofast usually says nothing useful, but here it is:
Program received signal SIGSEGV, Segmentation fault.
0x000000000043a178 in ?? ()
(gdb) backtrace
#0 0x000000000043a178 in ?? ()
#1 0x000000000043f19b in ?? ()
#2 0x000000000040e18f in ?? ()
#3 0x000000000040d1d9 in ?? ()
#4 0x0000000000443922 in ?? ()
#5 0x0000000000403dd8 in ?? ()
#6 0x00007ffff718ea40 in __libc_start_main (main=0x403b30, argc=4, argv=0x7fffffffe518,
init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe508)
at libc-start.c:289
#7 0x0000000000403f29 in ?? ()
Please note that cmix works fine on my Ubuntu when compiled with -O3 or lower.
Ah, makes sense that you won't get the stack trace with -Ofast. I am currently running a benchmark that will take a couple days to finish, but when I get a chance I will try testing the file you posted. Have you tried testing -Ofast with other files? Does it consistently segfault? Thanks!
It seems to segfault with any file (I tried README and COPYING from the cmix git). This is why it would be better to have a define such as -DDEBUG_LEVELX in the debug build, to be able to dump some trace. I made an strace of the fault; the last lines are:
open("README", O_RDONLY) = 3
open("o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
lseek(3, 0, SEEK_END) = 2530
lseek(3, 0, SEEK_CUR) = 2530
lseek(3, 0, SEEK_SET) = 0
read(3, "cmix version 10\nhttp://www.byron"..., 8191) = 2530
mmap(NULL, 2166784, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f26fecf9000
mmap(NULL, 2166784, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f26feae8000
mmap(NULL, 8654848, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f26fe2a7000
mmap(NULL, 4329472, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f26fde86000
mmap(NULL, 4329472, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f26fda65000
mmap(NULL, 4160753664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2605a64000
mmap(NULL, 33558528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2603a63000
mmap(NULL, 33558528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2601a62000
mmap(NULL, 67112960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f25fda61000
mmap(NULL, 10043392, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f25fd0cd000
mmap(NULL, 4160753664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f25050cc000
mmap(NULL, 268439552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f24f50cb000
brk(0xc2cb4000) = 0xc2cb4000
brk(0xc2cf4000) = 0xc2cf4000
brk(0xc2d18000) = 0xc2d18000
brk(0xc2d48000) = 0xc2d48000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
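The mmap sizes in this trace also illustrate the earlier remark about 32-bit machines. Below is a minimal sketch of that arithmetic, assuming a 32-bit build would request the same sizes as this 64-bit run. The helper `fits_in_address_space` and the constant `kLargeMap` are hypothetical names made up for illustration and are not part of cmix.

```cpp
#include <cstdint>

// Hypothetical helper (not part of cmix): does a request of `bytes`
// fit in a flat address space that is `pointer_bits` bits wide?
constexpr bool fits_in_address_space(std::uint64_t bytes, unsigned pointer_bits) {
    // For this sketch a 64-bit space is treated as always large enough.
    return pointer_bits >= 64 || bytes < (std::uint64_t{1} << pointer_bits);
}

// Size of each of the two largest mmap requests in the trace (about 3.875 GiB).
constexpr std::uint64_t kLargeMap = 4160753664ULL;

// A single mapping is just below 2^32 = 4294967296 bytes...
static_assert(fits_in_address_space(kLargeMap, 32),
              "one mapping is numerically below 4 GiB");
// ...but the two together need about 7.75 GiB, beyond any 32-bit space.
static_assert(!fits_in_address_space(2 * kLargeMap, 32),
              "two mappings cannot fit in 32 bits");
static_assert(fits_in_address_space(2 * kLargeMap, 64),
              "a 64-bit address space has room for both");
```

In practice a 32-bit Linux process typically gets noticeably less than the full 4 GiB of user address space, so even the first of these requests would likely fail there, which is consistent with the std::bad_alloc report.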
hmm - still looks like it will be tricky to debug this. I guess one thing I can do is switch the default cmix compiler flag to -O3, and add a note to the readme that -Ofast is unreliable (but works on some computers). I'm not sure what to suggest to further debug this - perhaps try compiling earlier cmix versions and see if they have the same issue?
I've added -fsanitize=undefined to the build and got this:
./cmix -c README o
/usr/include/c++/5/bits/valarray_array.h:155:9: runtime error: null pointer passed as argument 2, which is declared to never be null
src/models/paq8hp.cpp:1026:33: runtime error: signed integer overflow: 134217728 * 31 cannot be represented in type 'int'
src/models/paq8hp.cpp:830:34: runtime error: signed integer overflow: 134217728 * 31 cannot be represented in type 'int'
src/models/paq8hp.cpp:348:37: runtime error: load of misaligned address 0x0000ae6ccd10 for type '__m256i', which requires 32 byte alignment
0x0000ae6ccd10: note: pointer points here
00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
src/models/paq8hp.cpp:348:37: runtime error: load of misaligned address 0x7f1862f533b0 for type '__m256i', which requires 32 byte alignment
0x7f1862f533b0: note: pointer points here
00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02 00 02
^
Segmentation fault (core dumped)
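The "signed integer overflow" lines above can be reproduced on paper: 134217728 * 31 is larger than the maximum value of a 32-bit int. The following is a minimal sketch of that check, with one well-defined alternative (doing the multiplication in an unsigned type). The helper names are hypothetical, and this is not a claim about how paq8hp.cpp was or should be changed.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: does a * b fit in a 32-bit signed int?
// The product is computed in 64 bits, which is safe for small operands like these.
constexpr bool product_fits_in_int32(std::int64_t a, std::int64_t b) {
    return a * b >= std::numeric_limits<std::int32_t>::min() &&
           a * b <= std::numeric_limits<std::int32_t>::max();
}

// Unsigned multiplication is well-defined: it wraps modulo 2^32
// (on platforms where uint32_t is unsigned int, i.e. the usual case).
constexpr std::uint32_t product_u32(std::uint32_t a, std::uint32_t b) {
    return a * b;
}

// The operands reported by the sanitizer: 134217728 is 2^27.
static_assert(!product_fits_in_int32(134217728, 31),
              "134217728 * 31 overflows a 32-bit signed int");
// The true product, 4160749568, still fits in 32 unsigned bits.
static_assert(product_u32(134217728u, 31u) == 4160749568u,
              "the unsigned product is exact here");
```

Signed overflow is undefined behavior in C++, so an optimizer is allowed to assume it never happens; that is one plausible reason why a build with more aggressive flags could behave differently from -O3.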
Maybe this helps. I found cmix too slow for my project, so I switched to a custom compression algorithm.
Thanks for letting me know about "-fsanitize=undefined"; it exposed multiple bugs. My latest commit fixed the problems. In case you have some free time, I would be curious whether cmix with -Ofast works on your computer now.
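For readers following along, the misaligned-load reports from the sanitizer can be checked by hand: both addresses are multiples of 16 but not of 32, while an aligned `__m256i` load requires 32-byte alignment. The sketch below verifies this and shows one generic way to obtain a 32-byte-aligned buffer with `alignas`. The helper names are hypothetical, and this is not necessarily the approach taken in the commit mentioned above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: is `address` a multiple of `alignment`?
constexpr bool is_aligned(std::uint64_t address, std::size_t alignment) {
    return address % alignment == 0;
}

// The two addresses reported by -fsanitize=undefined.
static_assert(is_aligned(0x0000ae6ccd10ULL, 16), "16-byte aligned");
static_assert(!is_aligned(0x0000ae6ccd10ULL, 32), "but not 32-byte aligned");
static_assert(is_aligned(0x7f1862f533b0ULL, 16), "16-byte aligned");
static_assert(!is_aligned(0x7f1862f533b0ULL, 32), "but not 32-byte aligned");

// One generic way to get suitably aligned storage: request it with alignas.
inline bool local_buffer_is_32_byte_aligned() {
    alignas(32) unsigned char buffer[64] = {};
    return is_aligned(reinterpret_cast<std::uintptr_t>(buffer), 32);
}
```

The other common option is to keep the data where it is and use an unaligned load intrinsic such as `_mm256_loadu_si256` instead of an aligned load.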
Working like a charm, compiled with stock "-Ofast -march=native"
rus@Evo:~/cmix$ ./cmix -c README o
2530 bytes -> 890 bytes in 5.12 s.
cross entropy: 2.814
rus@Evo:~/cmix$