Release-built hoice sometimes crashes while learning on macOS Monterey
khei4 opened this issue · 7 comments
Input
(set-logic HORN)
(declare-fun
main@_6
( Int Int Int ) Bool
)
(declare-fun
main@.preheader
( Int Int Int Int ) Bool
)
(declare-fun
main@.lr.ph
( Int Int ) Bool
)
(assert
(forall
( (A Int) (C Int) (I Int) (J Int) )
(=>
(and
(>= (+ C (* (- 1) J)) 0) (>= (+ I (* (- 1) A)) 1)
(main@.preheader C A I J)
)
(main@.preheader (+ C (- 1)) (+ A 1) I J)
)
)
)
(assert
(forall
( (H Int) (I Int) )
(=>
(and
true
true
)
(main@_6 0 H I)
)
)
)
(assert
(forall
( (A Int) (B Int) (D Int) )
(=>
(and
(not (= D 0))
(main@.lr.ph A B)
)
(main@.lr.ph (+ A 1) (+ B 1))
)
)
)
(assert
(forall
( (B Int) (C Int) (L Int) (M Int) )
(=>
(and
(not (= B 0))
(main@_6 C L M)
)
(main@_6 (+ C 1) L M)
)
)
)
(assert
(forall
( (B Int) (C Int) (D Int) )
(=>
(and
(or (>= (+ C (* (- 1) B)) 1) (>= (+ D (* (- 1) C)) 0))
(main@_6 C B D)
)
(main@.preheader 0 0 (- 1) 1)
)
)
)
(assert
(forall
( (A Int) (B Int) (C Int) (H Int) )
(=>
(and
(or (>= (+ B (* (- 1) A)) 1) (>= (+ C (* (- 1) B)) 0)) (not (= H 0))
(main@_6 B A C)
)
(main@.lr.ph 0 0)
)
)
)
(assert
(forall
( (A Int) (C Int) (D Int) (G Int) )
(=>
(and
(>= (+ G (* (- 1) A)) 2) (>= (+ D (* (- 1) C)) 1)
(main@.preheader (+ A 1) C D G)
)
false
)
)
)
(assert
(forall
( (B Int) (I Int) )
(=>
(and
true
(main@.lr.ph I B)
)
(main@.preheader (+ B 1) 0 I 1)
)
)
)
(check-sat)
Output (sometimes)
hoice(62406,0x70000f03c000) malloc: *** error for object 0x600002c34bb0: pointer being freed was not allocated
hoice(62406,0x70000f03c000) malloc: *** set a breakpoint in malloc_error_break to debug
fish: Job 1, './target/release/hoice --log_...' terminated by signal SIGABRT (Abort)
Version
macOS Monterey 12.2.1
hoice 1.8.1 (manually built(1c9a5d110 2021-01-29)
z3 4.8.13
rustc 1.60.0
This error doesn't happen on Debian 11, z3 4.8.13, rustc 1.57.0
Stack Trace
hoice(63567,0x70000b73a000) malloc: *** error for object 0x600003500000: pointer being freed was not allocated
hoice(63567,0x70000b73a000) malloc: *** set a breakpoint in malloc_error_break to debug
thread #2, name = 'ice', stop reason = signal SIGABRT
frame #0: 0x00007ff805dd0112 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007ff805e06214 libsystem_pthread.dylib`pthread_kill + 263
frame #2: 0x00007ff805d52d10 libsystem_c.dylib`abort + 123
frame #3: 0x00007ff805c303e2 libsystem_malloc.dylib`malloc_vreport + 548
frame #4: 0x00007ff805c335ed libsystem_malloc.dylib`malloc_report + 151
frame #5: 0x00000001002754b7 hoice`hoice::data::Data::destroy::hb59ed86746733263 + 2887
frame #6: 0x00000001000c77a3 hoice`hoice::learning::ice::IceLearner::learn::h4fa19c30151517de + 1283
frame #7: 0x00000001000c6238 hoice`hoice::learning::ice::IceLearner::run::h847aee6c547ef342 + 1800
frame #8: 0x00000001000c43a4 hoice`hoice::learning::ice::Launcher::launch::h86dedf9dfc6a0166 + 436
frame #9: 0x0000000100336752 hoice`std::sys_common::backtrace::__rust_begin_short_backtrace::hf1caa9859d2433f7 + 162
frame #10: 0x000000010019faa2 hoice`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h3fd6720881912218 + 178
frame #11: 0x00000001003de767 hoice`std::sys::unix::thread::Thread::new::thread_start::h9b70539161da4eaf [inlined] _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h76ed59d8775c95eb at boxed.rs:1853:9 [opt]
frame #12: 0x00000001003de761 hoice`std::sys::unix::thread::Thread::new::thread_start::h9b70539161da4eaf [inlined] _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h9a39ecfa1322b383 at boxed.rs:1853:9 [opt]
frame #13: 0x00000001003de75a hoice`std::sys::unix::thread::Thread::new::thread_start::h9b70539161da4eaf at thread.rs:108:17 [opt]
frame #14: 0x00007ff805e064f4 libsystem_pthread.dylib`_pthread_start + 125
frame #15: 0x00007ff805e0200f libsystem_pthread.dylib`thread_start + 15
libsystem_kernel.dylib`__pthread_kill:
0x7ff805dd0108 <+0>: movl $0x2000148, %eax ; imm = 0x2000148
0x7ff805dd010d <+5>: movq %rcx, %r10
0x7ff805dd0110 <+8>: syscall
-> 0x7ff805dd0112 <+10>: jae 0x7ff805dd011c ; <+20>
0x7ff805dd0114 <+12>: movq %rax, %rdi
0x7ff805dd0117 <+15>: jmp 0x7ff805dca2f1 ; cerror_nocancel
0x7ff805dd011c <+20>: retq
0x7ff805dd011d <+21>: nop
0x7ff805dd011e <+22>: nop
0x7ff805dd011f <+23>: nop
rax = 0x0000000000000000
rdx = 0x0000000000000000
rbx = 0x000070000b73a000
rbp = 0x000070000b737ab0
rsp = 0x000070000b737a88
Wow. Thank you for this detailed issue!
This looks like a bug far underneath hoice itself. I'll try my best to find time to investigate, feel free to ping me if I don't get back to you "soon".
The last frame of the stack trace is hoice::data::Data::destroy
, which does nothing but extract statistics for profiling. So it's pretty weird, but the good news is that I don't think it will take a long time to identify whether the problem is in hoice itself or in something below (a Rust library hoice uses, Rust's stdlib itself, rustc
or macos).
@khei4 I could not reproduce the bug 😢
I pushed a new version of hoice though, totally unrelated to this issue. I tested your clauses with it running z3 4.8.17 and nothing went wrong on Monterey 12.3.1.
Let me know if updating hoice + z3 fixes the problem, just in case. In the mean time I'm going to close this issue as I cannot reproduce it. Do feel free to re-open it if new clues come up.
@AdrienChampion Thank you! I can reproduce this error with HoIce(commit hash: 90d348933) for the same input on my MacBookPro (2018 model with Intel Core i5) with the same version z3 as you write. I guess this bug is related to the concurrency and happens non deterministically(sometimes nothing happens on this). Could you reproduce this by executing repeatedly? If nothing happens, maybe some of my dependencies are the cause of this...
I did a while true
running hoice on your clauses on each iteration for a while and failed to crash it. I ran on M1 though, not on Intel. I do have a macbook pro running Monterey on Intel, though I believe it's an i7.
Hmm, I wrote hoice quite a while ago and I thought I did not use unsafe
, but I just took a look and I actually do use unsafe to implement Send
and Sync
(concurrency-related traits) on some types. The fact that I have to do this means Rust cannot show automatically that these types are thread-safe, so maybe there's a problem there.
Outside of a bug in the compiler/LLVM/your machine, it's relatively safe (!) to assume the problem comes from an unsafe
block somewhere.
I still find it weird that I cannot reproduce this behavior.
I will investigate further but I need some time, probably a few days. Please ping me in this thread if this issue is time-sensitive for you 🐱
In the mean time, thank you again for opening the issue and keeping in touch to help solve it!
(Re-opening the issue.)
@AdrienChampion Thank you! I'm not in haste. So you don't need to be careful about the time 😃
Now you really shouldn't say that, I'm going to pretend I did not see anything otherwise it will take centuries 🐱