hopv/hoice

Release-built hoice sometimes crashes while learning on macOS Monterey

khei4 opened this issue · 7 comments

khei4 commented

Input

(set-logic HORN)

(declare-fun
  main@_6
  ( Int Int Int ) Bool
)
(declare-fun
  main@.preheader
  ( Int Int Int Int ) Bool
)
(declare-fun
  main@.lr.ph
  ( Int Int ) Bool
)

(assert 
  (forall
    ( (A Int) (C Int) (I Int) (J Int) )
    (=>
      (and
        (>= (+ C (* (- 1) J)) 0) (>= (+ I (* (- 1) A)) 1)
        (main@.preheader C A I J)
      )
      (main@.preheader (+ C (- 1)) (+ A 1) I J)
    )
  )
)

(assert 
  (forall
    ( (H Int) (I Int) )
    (=>
      (and
        true
        true
      )
      (main@_6 0 H I)
    )
  )
)

(assert 
  (forall
    ( (A Int) (B Int) (D Int) )
    (=>
      (and
        (not (= D 0))
        (main@.lr.ph A B)
      )
      (main@.lr.ph (+ A 1) (+ B 1))
    )
  )
)

(assert 
  (forall
    ( (B Int) (C Int) (L Int) (M Int) )
    (=>
      (and
        (not (= B 0))
        (main@_6 C L M)
      )
      (main@_6 (+ C 1) L M)
    )
  )
)

(assert 
  (forall
    ( (B Int) (C Int) (D Int) )
    (=>
      (and
        (or (>= (+ C (* (- 1) B)) 1) (>= (+ D (* (- 1) C)) 0))
        (main@_6 C B D)
      )
      (main@.preheader 0 0 (- 1) 1)
    )
  )
)

(assert 
  (forall
    ( (A Int) (B Int) (C Int) (H Int) )
    (=>
      (and
        (or (>= (+ B (* (- 1) A)) 1) (>= (+ C (* (- 1) B)) 0)) (not (= H 0))
        (main@_6 B A C)
      )
      (main@.lr.ph 0 0)
    )
  )
)
(assert 
  (forall
    ( (A Int) (C Int) (D Int) (G Int) )
    (=>
      (and
        (>= (+ G (* (- 1) A)) 2) (>= (+ D (* (- 1) C)) 1)
        (main@.preheader (+ A 1) C D G)
      )
      false
    )
  )
)
(assert 
  (forall
    ( (B Int) (I Int) )
    (=>
      (and
        true
        (main@.lr.ph I B)
      )
      (main@.preheader (+ B 1) 0 I 1)
    )
  )
)
(check-sat)

Output (sometimes)

hoice(62406,0x70000f03c000) malloc: *** error for object 0x600002c34bb0: pointer being freed was not allocated
hoice(62406,0x70000f03c000) malloc: *** set a breakpoint in malloc_error_break to debug
fish: Job 1, './target/release/hoice --log_...' terminated by signal SIGABRT (Abort)

Version

macOS Monterey 12.2.1
hoice 1.8.1 (manually built(1c9a5d110 2021-01-29)
z3 4.8.13
rustc 1.60.0

This error doesn't happen on Debian 11, z3 4.8.13, rustc 1.57.0

Stack Trace

hoice(63567,0x70000b73a000) malloc: *** error for object 0x600003500000: pointer being freed was not allocated
hoice(63567,0x70000b73a000) malloc: *** set a breakpoint in malloc_error_break to debug
  thread #2, name = 'ice', stop reason = signal SIGABRT
    frame #0: 0x00007ff805dd0112 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007ff805e06214 libsystem_pthread.dylib`pthread_kill + 263
    frame #2: 0x00007ff805d52d10 libsystem_c.dylib`abort + 123
    frame #3: 0x00007ff805c303e2 libsystem_malloc.dylib`malloc_vreport + 548
    frame #4: 0x00007ff805c335ed libsystem_malloc.dylib`malloc_report + 151
    frame #5: 0x00000001002754b7 hoice`hoice::data::Data::destroy::hb59ed86746733263 + 2887
    frame #6: 0x00000001000c77a3 hoice`hoice::learning::ice::IceLearner::learn::h4fa19c30151517de + 1283
    frame #7: 0x00000001000c6238 hoice`hoice::learning::ice::IceLearner::run::h847aee6c547ef342 + 1800
    frame #8: 0x00000001000c43a4 hoice`hoice::learning::ice::Launcher::launch::h86dedf9dfc6a0166 + 436
    frame #9: 0x0000000100336752 hoice`std::sys_common::backtrace::__rust_begin_short_backtrace::hf1caa9859d2433f7 + 162
    frame #10: 0x000000010019faa2 hoice`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h3fd6720881912218 + 178
    frame #11: 0x00000001003de767 hoice`std::sys::unix::thread::Thread::new::thread_start::h9b70539161da4eaf [inlined] _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h76ed59d8775c95eb at boxed.rs:1853:9 [opt]
    frame #12: 0x00000001003de761 hoice`std::sys::unix::thread::Thread::new::thread_start::h9b70539161da4eaf [inlined] _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h9a39ecfa1322b383 at boxed.rs:1853:9 [opt]
    frame #13: 0x00000001003de75a hoice`std::sys::unix::thread::Thread::new::thread_start::h9b70539161da4eaf at thread.rs:108:17 [opt]
    frame #14: 0x00007ff805e064f4 libsystem_pthread.dylib`_pthread_start + 125
    frame #15: 0x00007ff805e0200f libsystem_pthread.dylib`thread_start + 15
libsystem_kernel.dylib`__pthread_kill:
    0x7ff805dd0108 <+0>:  movl   $0x2000148, %eax          ; imm = 0x2000148 
    0x7ff805dd010d <+5>:  movq   %rcx, %r10
    0x7ff805dd0110 <+8>:  syscall 
->  0x7ff805dd0112 <+10>: jae    0x7ff805dd011c            ; <+20>
    0x7ff805dd0114 <+12>: movq   %rax, %rdi
    0x7ff805dd0117 <+15>: jmp    0x7ff805dca2f1            ; cerror_nocancel
    0x7ff805dd011c <+20>: retq   
    0x7ff805dd011d <+21>: nop    
    0x7ff805dd011e <+22>: nop    
    0x7ff805dd011f <+23>: nop
     rax = 0x0000000000000000
     rdx = 0x0000000000000000
     rbx = 0x000070000b73a000
     rbp = 0x000070000b737ab0
     rsp = 0x000070000b737a88

Wow. Thank you for this detailed issue!

This looks like a bug far underneath hoice itself. I'll try my best to find time to investigate, feel free to ping me if I don't get back to you "soon".

The last frame of the stack trace is hoice::data::Data::destroy, which does nothing but extract statistics for profiling. So it's pretty weird, but the good news is that I don't think it will take a long time to identify whether the problem is in hoice itself or in something below (a Rust library hoice uses, Rust's stdlib itself, rustc or macos).

@khei4 I could not reproduce the bug 😢

I pushed a new version of hoice though, totally unrelated to this issue. I tested your clauses with it running z3 4.8.17 and nothing went wrong on Monterey 12.3.1.

Let me know if updating hoice + z3 fixes the problem, just in case. In the mean time I'm going to close this issue as I cannot reproduce it. Do feel free to re-open it if new clues come up.

khei4 commented

@AdrienChampion Thank you! I can reproduce this error with HoIce(commit hash: 90d348933) for the same input on my MacBookPro (2018 model with Intel Core i5) with the same version z3 as you write. I guess this bug is related to the concurrency and happens non deterministically(sometimes nothing happens on this). Could you reproduce this by executing repeatedly? If nothing happens, maybe some of my dependencies are the cause of this...

I did a while true running hoice on your clauses on each iteration for a while and failed to crash it. I ran on M1 though, not on Intel. I do have a macbook pro running Monterey on Intel, though I believe it's an i7.

Hmm, I wrote hoice quite a while ago and I thought I did not use unsafe, but I just took a look and I actually do use unsafe to implement Send and Sync (concurrency-related traits) on some types. The fact that I have to do this means Rust cannot show automatically that these types are thread-safe, so maybe there's a problem there.
Outside of a bug in the compiler/LLVM/your machine, it's relatively safe (!) to assume the problem comes from an unsafe block somewhere.

I still find it weird that I cannot reproduce this behavior.

I will investigate further but I need some time, probably a few days. Please ping me in this thread if this issue is time-sensitive for you 🐱
In the mean time, thank you again for opening the issue and keeping in touch to help solve it!

(Re-opening the issue.)

khei4 commented

@AdrienChampion Thank you! I'm not in haste. So you don't need to be careful about the time 😃

Now you really shouldn't say that, I'm going to pretend I did not see anything otherwise it will take centuries 🐱