ekmett/reflection

Crash in test suite on macOS with GHC 9.0.2

Closed · 13 comments

$ lldb ./dist/build/spec/spec
(lldb) target create "./dist/build/spec/spec"
Current executable set to './dist/build/spec/spec' (x86_64).
(lldb) run
Process 36822 launched: './dist/build/spec/spec' (x86_64)

ReifyNat
  reifyNat
    reify positive Integers and reflect them back
      +++ OK, passed 100 tests.
Process 36822 stopped
* thread #1: tid = 0x3c7305, 0x00000001001893a8 spec`stg_ap_0_fast + 8, name = 'ghc_ticker', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x00000001001893a8 spec`stg_ap_0_fast + 8
spec`stg_ap_0_fast:
->  0x1001893a8 <+8>:  movslq -0x8(%rax), %rcx
    0x1001893ac <+12>: cmpq   $0x1a, %rcx
    0x1001893b0 <+16>: jb     0x1001893cc               ; <+44>
    0x1001893b2 <+18>: cmpq   $0x1c, %rcx
(lldb) bt all
* thread #1: tid = 0x3c7305, 0x00000001001893a8 spec`stg_ap_0_fast + 8, name = 'ghc_ticker', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00000001001893a8 spec`stg_ap_0_fast + 8
    frame #1: 0x0000004200021818

  thread #2: tid = 0x3c731c, 0x00007fff5d49bd82 libsystem_kernel.dylib`__semwait_signal + 10
    frame #0: 0x00007fff5d49bd82 libsystem_kernel.dylib`__semwait_signal + 10
    frame #1: 0x00007fff5d416724 libsystem_c.dylib`nanosleep + 199
    frame #2: 0x000000010016e35b spec`rtsSleep + 91
    frame #3: 0x00000001001878aa spec`itimer_thread_func + 74
    frame #4: 0x00007fff5d663661 libsystem_pthread.dylib`_pthread_body + 340
    frame #5: 0x00007fff5d66350d libsystem_pthread.dylib`_pthread_start + 377
    frame #6: 0x00007fff5d662bf9 libsystem_pthread.dylib`thread_start + 13
  • OS: Darwin xxx 17.7.0 Darwin Kernel Version 17.7.0: Fri Oct 30 13:34:27 PDT 2020; root:xnu-4570.71.82.8~1/RELEASE_X86_64 x86_64
  • Packages:
$ ghc-pkg list
/nix/store/6cdybgdmv7fj4n1lnaxfwcdabb72sdy3-ghc-9.0.2-with-packages/lib/ghc-9.0.2/package.conf.d
    Cabal-3.4.1.0
    HUnit-1.6.2.0
    QuickCheck-2.14.2
    ansi-terminal-0.11.1
    array-0.5.4.0
    base-4.15.1.0
    binary-0.8.8.0
    bytestring-0.10.12.1
    call-stack-0.4.0
    clock-0.8.3
    colour-2.3.6
    containers-0.6.4.1
    deepseq-1.4.5.0
    directory-1.3.6.2
    exceptions-0.10.4
    filepath-1.4.2.1
    ghc-9.0.2
    ghc-bignum-1.1
    ghc-boot-9.0.2
    ghc-boot-th-9.0.2
    ghc-compact-0.1.0.0
    ghc-heap-9.0.2
    ghc-prim-0.7.0
    ghci-9.0.2
    haskeline-0.8.2
    hpc-0.6.1.0
    hspec-2.8.5
    hspec-core-2.8.5
    hspec-discover-2.8.5
    hspec-expectations-0.8.2
    integer-gmp-1.1
    libiserv-9.0.2
    mtl-2.2.2
    parsec-3.1.14.0
    pretty-1.1.3.6
    primitive-0.7.3.0
    process-1.6.13.2
    quickcheck-io-0.2.0
    random-1.2.1
    rts-1.0.2
    setenv-0.1.1.3
    splitmix-0.1.0.4
    stm-2.5.0.0
    template-haskell-2.17.0.0
    terminfo-0.4.1.5
    text-1.2.5.0
    tf-random-0.5
    time-1.9.3
    transformers-0.5.6.2
    unix-2.7.2.2
    xhtml-3000.2.2.1

Can you report this to the upstream GHC issue tracker at https://gitlab.haskell.org/ghc/ghc/-/issues ? I can't reproduce this on Linux, and since there's nothing in this library that is OS-specific (AFAICT), I can't imagine that reflection is specifically at fault here.

Is this an M1 Mac?

Agreed that it’s likely a compiler bug. I’ll try to reproduce it.

> Is this an M1 Mac?

x86_64

OK. Which OS X release? I can look up the kernel number, but knowing the release makes it easier :)

ProductName:	Mac OS X
ProductVersion:	10.13.6
BuildVersion:	17G14042

This also failed in the same way on other Macs (NixOS's build machines), which probably run different macOS versions, though likely the same versions of libSystem are involved.

The reflection version is 2.1.6 from Hackage, btw.

Upon further examination, I can reproduce this crash (or at least the Linux equivalent of it) using GHC 9.0 in particular:

$ cabal run test:spec -w ghc-9.0.2
Up to date

ReifyNat
  reifyNat
    reify positive Integers and reflect them back [✔]
      +++ OK, passed 100 tests.
    should throw an Underflow exception on negative inputs [ ]Segmentation fault (core dumped)

I had previously tried GHC 8.10.7 and 9.2.1, which don't exhibit this bug, and mistakenly concluded that GHC 9.0 was unaffected as well. That'll teach me to make assumptions! I've opened an issue upstream (GHC#21141).
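For anyone who wants to repeat the comparison across compilers, here is a minimal sketch that loops the `cabal run test:spec -w ghc-<version>` invocation shown above over the three versions mentioned in this thread. It assumes each compiler is on `PATH` as `ghc-<version>`. The `run_matrix` function and the `CABAL` override are hypothetical names introduced only for this illustration, not part of reflection or cabal.

```shell
#!/bin/sh
# Sketch only: run the test suite once per GHC version and print a summary line.
# CABAL is a hypothetical override for this example; it defaults to the real cabal.
CABAL="${CABAL:-cabal}"

# run_matrix VERSION... : hypothetical helper for this example.
run_matrix() {
  for v in "$@"; do
    # Same invocation as in the transcript above, parameterised on the compiler.
    if "$CABAL" run test:spec -w "ghc-$v"; then
      echo "ghc-$v: ok"
    else
      echo "ghc-$v: FAILED"
    fi
  done
}

# Versions taken from this thread.
run_matrix 8.10.7 9.0.2 9.2.1
```

A non-zero exit from the test executable (including a segfault) should show up as `FAILED` for that version. The exact exit code that cabal reports is not checked here.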

Thanks a lot for looking into this so thoroughly! Very ominous issue indeed.

GHC 9.0.2 is somewhat important though, as the next Stackage LTS release will use it. Interestingly, it seems that they haven't yet been able to trigger the crash in their build infrastructure.

Stackage LTS had better help with compiler bugs then :)

GHC#21141, the upstream GHC issue about this bug, has been fixed, and the fix is present in GHC 9.2.2. Since this is squarely a GHC bug, not a reflection one, I'm going to close this issue.