gnustep/libobjc2

ObjCXXEHInterop tests fail on Android

Closed this issue · 18 comments

Running the ObjCXXEHInterop* tests on Android ARMv7 fails with the following messages (libobjc2 was compiled with DEBUG_EXCEPTIONS=1). The same tests succeed on ARM64.

RUNNING ObjCXXEHInterop...
Poking from minRepM
Raising MyException
Throwing 0xea817014
Caught - re-raising
Throwing 0xea817014
New personality function called 0xea847080
Class: GNUCOBJC
LSDA: 0xac8dcd10
Search phase...
Filter: 1
Class name: Test
0xac8df1d0 type: 1
found handler for Test
handler: 4
Found handler! 4
Aborted 
FAILED: ObjCXXEHInterop

RUNNING ObjCXXEHInterop_legacy...
Poking from minRepM
Raising MyException
Throwing 0xee317014
Caught - re-raising
Throwing 0xee317014
New personality function called 0xee355080
Class: GNUCOBJC
LSDA: 0xb2111d10
Search phase...
Filter: 1
Class name: Test
0xee3420a0 type: 1
found handler for Test
handler: 4
Found handler! 4
Aborted 
FAILED: ObjCXXEHInterop_legacy

RUNNING ObjCXXEHInterop_legacy_optimised...
Poking from minRepM
Raising MyException
Throwing 0xf2397014
Caught - re-raising
Throwing 0xf2397014
New personality function called 0xf23d5080
Class: GNUCOBJC
LSDA: 0xb9c71d2c
Search phase...
Filter: 1
Class name: Test
0xf23c20a0 type: 1
found handler for Test
handler: 4
Found handler! 4
Aborted 
FAILED: ObjCXXEHInterop_legacy_optimised

RUNNING ObjCXXEHInterop_optimised...
Poking from minRepM
Raising MyException
Throwing 0xef817014
Caught - re-raising
Throwing 0xef817014
New personality function called 0xef847080
Class: GNUCOBJC
LSDA: 0xb2f5ed18
Search phase...
Filter: 1
Class name: Test
0xb2f611d0 type: 1
found handler for Test
handler: 4
Found handler! 4
Aborted 
FAILED: ObjCXXEHInterop_optimised

As I’m not really familiar with Objective-C/C++ exception handling mechanisms, I’d appreciate any pointers to track this down further.

This is with libobjc2 and PathScale’s libcxxrt compiled using the GNUstep Android toolchain, i.e. using the latest master versions built with clang from the Android NDK.

Sad, using a C++ runtime that I wrote means that I can't blame someone else...

It looks as if the Objective-C personality function is being called in the search phase and is finding a handler, but the generic unwinder then does nothing with it and crashes instead. This test was passing on FreeBSD/ARMv7, so it might be something specific to the libunwind that Android uses. When libcxxrt is built on Android, does it use the ARM or Itanium ABI for exceptions?

> When libcxxrt is built on Android, does it use the ARM or Itanium ABI for exceptions?

It’s going for the "unwind-arm.h" include in unwind.h.

unwind-arm.h is used with an ARM unwinder, but there are some ifdefs that determine the layout of the exception structure and LSDA. These may be mismatched between libcxxrt and libobjc2.

There are indeed some mismatches between unwind-arm.h in libobjc2 and libcxxrt, although not in the exception structure itself. libobjc2 added support for forced unwinding in 2014, whereas libcxxrt added support for _US_ACTION_MASK in 2016.

Could this mismatch be related to this issue? If so I’ll try aligning the headers.
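For context on what _US_ACTION_MASK is for: on ARM EHABI the personality routine receives a state word whose low bits encode the unwind action, while a separate bit flags forced unwinding. The following is a minimal sketch, assuming the constant values found in LLVM libunwind's unwind.h; the type name `unwind_state_t` and the helpers `eh_action` and `eh_is_forced` are hypothetical and do not come from libobjc2 or libcxxrt.

```c
#include <stdint.h>

/* Hypothetical alias for the state word passed to an ARM EHABI personality
 * routine. */
typedef uint32_t unwind_state_t;

/* Values as defined in LLVM libunwind's unwind.h (assumption: your toolchain
 * header uses the same values). */
enum
{
	_US_VIRTUAL_UNWIND_FRAME  = 0, /* phase 1: search for a handler */
	_US_UNWIND_FRAME_STARTING = 1, /* phase 2: begin cleanup of a frame */
	_US_UNWIND_FRAME_RESUME   = 2, /* phase 2: resume after a cleanup */
	_US_ACTION_MASK           = 3, /* selects the action bits */
	_US_FORCE_UNWIND          = 8  /* flag: forced unwind in progress */
};

/* Hypothetical helper: strip flag bits so that only the action remains. */
static inline unwind_state_t eh_action(unwind_state_t state)
{
	return state & _US_ACTION_MASK;
}

/* Hypothetical helper: report whether the forced-unwind flag is set. */
static inline int eh_is_forced(unwind_state_t state)
{
	return (state & _US_FORCE_UNWIND) != 0;
}
```

A personality routine that compares the raw state against _US_UNWIND_FRAME_STARTING without masking would not recognise the action during a forced unwind, which is the kind of divergence two differently aged headers could introduce.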

I tried aligning the unwind-arm.h files between libobjc2 and libcxxrt to no avail, but I’m also not sure if that’s the place you were referring to regarding the exception structure layout.

However I was now able to symbolize the crash log, which I hope might shed some light:

********** Crash dump: **********
Build fingerprint: 'samsung/dreamltexx/dreamlte:9/PPR1.180610.011/G950FXXS4DSE1:user/release-keys'
#00 0x0001ce62 /system/lib/libc.so (abort+58)
#01 0x000016f9 ObjCXXEHInterop	unwind_phase2(unw_context_t*, unw_cursor_t*, _Unwind_Control_Block*, bool)	libunwind_llvm/src/Unwind-EHABI.cpp:648:9
#02 0x00001793 ObjCXXEHInterop	_Unwind_Resume	libunwind_llvm/src/Unwind-EHABI.cpp:715:3
#03 0x00001007 ObjCXXEHInterop	poke_objcxx	ObjCXXEHInterop.mm:13:5
#04 0x0000104f ObjCXXEHInterop	main	ObjCXXEHInterop.m:16:5
#05 0x0008c2ed /system/lib/libc.so (__libc_init+48)
#06 0x00000e34 ObjCXXEHInterop	_start_main

So the crash happens in Unwind-EHABI.cpp:648 due to _URC_FAILURE.

I meant the C++ exception structure definitions, specifically these ifdefs: https://github.com/pathscale/libcxxrt/blob/f96846efbfd508f66d91fcbbef5dd808947c7f6d/src/cxxabi.h#L113

Thanks. As far as I can see the ARM-specific fields match (see https://github.com/gnustep/libobjc2/blob/master/objcxx_eh.cc#L85). libcxxrt does have an extra "referenceCount" field at the beginning or end of the struct. I tried adding that in the same way in libobjc2 but it didn’t change anything.

To clarify my question: When libcxxrt is built for your toolchain, is __ARM_DWARF_EH__ defined?

It looks like __ARM_DWARF_EH__ is not defined, so the code surrounded by #if defined(__arm__) && !defined(__ARM_DWARF_EH__) conditionals is compiled.

It’s unclear to me why it’s not defined though, as Itanium is listed as supported by the armeabi ABI being used here (however, I just started reading up on this...).
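The check described above can be reproduced with a small probe compiled with the same toolchain and flags as the libraries. This is only a sketch: the function name `eh_abi_name` is hypothetical, and the preprocessor condition is the one quoted in this thread.

```c
/* Returns a label for the exception-handling ABI selected at compile time.
 * The #if condition is the one quoted from the headers in this thread. */
static const char *eh_abi_name(void)
{
#if defined(__arm__) && !defined(__ARM_DWARF_EH__)
	return "ARM EHABI";
#else
	return "not ARM EHABI (Itanium-style tables on most other targets)";
#endif
}
```

Printing the result from a test binary built for the device shows which branch the headers take. Dumping the predefined macros with the compiler's `-dM -E` options is another way to see whether __ARM_DWARF_EH__ is set.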

Comparing libobjc2 and libcxxrt, it looks like unwind.h is missing a !defined(__ARM_DWARF_EH__) here, although it should not matter for this issue. Other than that I didn’t spot any significant differences so far, but I could very well be missing something.

I ran these tests on Android x86 and x86_64 and they fail there as well, although with a somewhat different output:

./ObjCXXEHInterop
Poking from minRepM
Raising MyException
Throwing 0x73d38202c048
Segmentation fault

The stack trace looks as follows (identical for both ABIs):

libc    : Fatal signal 11 (SIGSEGV), code 128 (SI_KERNEL), fault addr 0x0 in tid 8639 (ObjCXXEHInterop), pid 8639 (ObjCXXEHInterop)
DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
DEBUG   : Build fingerprint: 'google/sdk_gphone_x86_64/generic_x86_64:9/PSR1.180720.093/5456446:userdebug/dev-keys'
DEBUG   : Revision: '0'
DEBUG   : ABI: 'x86'
DEBUG   : pid: 8639, tid: 8639, name: ObjCXXEHInterop  >>> ./ObjCXXEHInterop <<<
DEBUG   : signal 11 (SIGSEGV), code 128 (SI_KERNEL), fault addr 0x0
DEBUG   :     eax f23cc024  ebx f26efe10  ecx 474e5543  edx 00000001
DEBUG   :     edi 432b2b00  esi f23cb010
DEBUG   :     ebp ffd086a8  esp ffd08680  eip f26e08d2
unwind  : Malformed section header found, ignoring...

Backtrace:
#00 0x000158d2 libobjc.so __gnustep_objcxx_personality_v0 eh_personality.c:498:4
#01 0x00016489 libcxxrt.so _Unwind_RaiseException gcc-4.9/libgcc/unwind.inc:113:0
#02 0x000150f3 libobjc.so objc_exception_throw eh_personality.c:177:28
#03 0x00000e66 ObjCXXEHInterop
#04 0x00000fd4 ObjCXXEHInterop
#05 0x000ccf74 /system/lib/libc.so (__libc_init+100)

So it looks like it’s crashing in this memcpy(). Does that shed some light on this?

That's surprising. It looks like it's a null pointer, so either ex->cxx_exception or exceptionObject is null. exceptionObject is the argument to the function and we've already dereferenced it, so that shouldn't be null. That leaves ex->cxx_exception, but that comes from __cxa_allocate_exception and was also dereferenced previously, so I'm not sure where the null pointer is coming from. Can you step through that function and see?

Thanks, I’ll dig into it and try to find a way to run these through a debugger on Android.

I was able to fix the test on Android x86 and x86_64 in #129. It was due to a misalignment of the __cxa_exception and _Unwind_Exception structs between libobjc2 and libcxxrt.

However, I haven’t been able to get any further with the test failing on ARMv7. The PR also updates the alignment of _Unwind_Exception to match on ARM, where it looked incorrect, but that neither fixes the test nor changes its output.
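To illustrate the kind of layout mismatch described here, the sketch below declares the same unwinder header twice, once with natural alignment and once with an explicit alignment attribute, and embeds each in a wrapper struct. All struct names, the field list, and the alignment value of 16 are hypothetical stand-ins chosen for illustration; this is not the actual change made in #129.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unwinder header as one library might declare it:
 * natural alignment only. */
struct unwind_header_plain
{
	uint64_t exception_class;
	void (*exception_cleanup)(void);
	uintptr_t private_1;
	uintptr_t private_2;
};

/* The same fields as a second library might declare them, with an explicit
 * alignment attribute (16 is an illustrative value). */
struct unwind_header_aligned
{
	uint64_t exception_class;
	void (*exception_cleanup)(void);
	uintptr_t private_1;
	uintptr_t private_2;
} __attribute__((aligned(16)));

/* Hypothetical C++ exception wrappers that end with the unwinder header. */
struct cxx_exception_plain
{
	void *exception_type;
	void *destructor;
	void *handler;
	struct unwind_header_plain unwind_header;
};

struct cxx_exception_aligned
{
	void *exception_type;
	void *destructor;
	void *handler;
	struct unwind_header_aligned unwind_header;
};

/* Returns nonzero when both declarations place the header at the same
 * offset and agree on the total size. */
static int layouts_agree(void)
{
	return offsetof(struct cxx_exception_plain, unwind_header) ==
	           offsetof(struct cxx_exception_aligned, unwind_header) &&
	       sizeof(struct cxx_exception_plain) ==
	           sizeof(struct cxx_exception_aligned);
}
```

On a target with 8-byte pointers the header lands at offset 24 in the first wrapper and at offset 32 in the second, so code built against one declaration reads the wrong bytes from an object created by the other. Comparing `offsetof` and `sizeof` for the real structs in both libraries is a quick way to confirm or rule out this class of bug.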

Thank you for looking into this! Unfortunately, the ObjCXXEHInterop test is still failing with the latest libobjc2 and libcxxrt master on the Android x86 simulator (I haven’t checked other ABIs yet):

Program received signal SIGSEGV, Segmentation fault.
0xf7cad585 in objc_init_cxx_exception (obj=0xf7b1b014) at ../objcxx_eh.cc:207
207		while (*ehcls != cxx_exception_class)
(gdb) bt
#0  0xf7cad585 in objc_init_cxx_exception (obj=0xf7b1b014) at ../objcxx_eh.cc:207
#1  0xf7ca5b32 in __gnustep_objcxx_personality_v0 (version=1, actions=1, exceptionClass=5138137972457228867, exceptionObject=0xf7b4b010, context=0xffffd720) at ../eh_personality.c:494
#2  0xf7dde48a in _Unwind_RaiseException (exc=0xf7b4b010) at /Volumes/Android/buildbot/src/android/gcc/toolchain/build/../gcc/gcc-4.9/libgcc/unwind.inc:113
#3  0xf7ca5384 in objc_exception_throw (object=0xf7b1b014) at ../eh_personality.c:176
#4  0x56555e67 in poke_objcxx () at ../Test/ObjCXXEHInterop.mm:12
#5  0x56555fd5 in main () at ../Test/ObjCXXEHInterop.m:16
(gdb) p ehcls
$1 = <optimized out>
(gdb) p cxxexception
$2 = (void *) 0xf7b4c044

What would be the best way to debug this further?

It looks like the cxx_exception_class value is not 64-bit aligned inside cxxexception, but is split between ehcls - 2 and ehcls - 3:

(gdb) p/x cxx_exception_class
$15 = 0x474e5543432b2b00
(gdb) p/x *(((uint64_t *)cxxexception) - 2)
$16 = 0xf7dcfb30474e5543
                ^^^^^^^^
(gdb) p/x *(((uint64_t *)cxxexception) - 3)
$17 = 0x432b2b0000000001
        ^^^^^^^^
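As a reading aid for these values: an exception class is eight characters packed into a 64-bit integer, so the two marked halves together spell the C++ class, just as the "Class: GNUCOBJC" line in the earlier log spells the Objective-C one. A minimal sketch, with hypothetical helper names `class_to_text` and `class_spells`:

```c
#include <stdint.h>
#include <string.h>

/* Unpack a 64-bit exception class into its eight characters, most
 * significant byte first. `out` must have room for 9 bytes. */
static void class_to_text(uint64_t cls, char out[9])
{
	for (int i = 0; i < 8; i++)
	{
		out[i] = (char)((cls >> (56 - 8 * i)) & 0xff);
	}
	out[8] = '\0';
}

/* Hypothetical convenience wrapper: does `cls` spell `text`? */
static int class_spells(uint64_t cls, const char *text)
{
	char buf[9];
	class_to_text(cls, buf);
	return strcmp(buf, text) == 0;
}
```

With this, 0x474e5543432b2b00 decodes to "GNUCC++" followed by a NUL byte, and its two 32-bit halves (0x474e5543 and 0x432b2b00) are the ones marked in the gdb output.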

Hmm, that's odd. It should be strongly aligned. It's easy to fudge it to walk backwards a byte at a time, but I think I'd rather we just fix libcxxrt at this point if this code works with both the shipping FreeBSD version of libcxxrt (which it does) and libcxxrt with your PR (does it?).
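For reference, the byte-at-a-time fallback mentioned above might look roughly like the sketch below. It is not the approach that was adopted, and the function names `find_class_backwards` and `demo_split_class` are hypothetical. The demo buffer assumes the layout seen in the gdb dump, where the class value starts 20 bytes before the end of the header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Search backwards from `end` for a 64-bit exception class, one byte at a
 * time, looking at most `max_back` bytes behind `end`. Returns a pointer to
 * the first byte of the match, or NULL if there is none. */
static const unsigned char *find_class_backwards(const unsigned char *end,
                                                 size_t max_back,
                                                 uint64_t wanted)
{
	for (size_t back = sizeof(uint64_t); back <= max_back; back++)
	{
		uint64_t candidate;
		/* memcpy avoids an unaligned 64-bit load */
		memcpy(&candidate, end - back, sizeof(candidate));
		if (candidate == wanted)
		{
			return end - back;
		}
	}
	return NULL;
}

/* Rebuilds a 32-byte header with the class stored at a 4-byte-aligned
 * offset and returns how far behind the end it is found (0 if not found). */
static size_t demo_split_class(void)
{
	const uint64_t cls = 0x474e5543432b2b00ULL;
	unsigned char header[32] = {0};
	memcpy(header + 12, &cls, sizeof(cls));
	const unsigned char *hit =
	    find_class_backwards(header + 32, sizeof(header), cls);
	return hit ? (size_t)((header + 32) - hit) : 0;
}
```

A loop that steps back eight bytes at a time from the end of this buffer would only inspect offsets 24, 16, 8 and 0, and would therefore miss the value at offset 12. That matches the symptom at objcxx_eh.cc:207, where the search runs past the start of the structure.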

Sounds good. I just ran the tests against the libcxxrt PR (libcxxrt/libcxxrt#1) and they pass (except BlockImpTest_optimised, but that’s probably a separate issue as it also failed before).

I also submitted #132, which I think reproduces on FreeBSD the same issues we’re seeing on Android when the tests are run against libcxxrt master.