trilemma-dev/SecureXPC

Helper crashing in Catalina & Big Sur.

Closed this issue · 9 comments

As per title, my helper is crashing in Catalina & Big Sur. It is fine in Monterey. After launching it runs a bunch of route requests fine, but then crashes with EXC_BAD_ACCESS.

It doesn't seem to crash in any of my route handling code nor on any particular route call, so I'm assuming it must be something to do with my route definitions, although I can't see what. I am wondering if anything jumps out to you in the crash log?

Crash log (truncated to fit allowed GitHub post size):
Process:               com.test.AppHelper.Debug [3028]
Path:                  /Library/PrivilegedHelperTools/com.test.AppHelper.Debug
Identifier:            com.test.AppHelper.Debug
Version:               1.0.0-beta.4 (100)
Code Type:             X86-64 (Native)
Parent Process:        launchd [1]
Responsible:           com.test.AppHelper.Debug [3028]
User ID:               0

Date/Time:             2022-05-01 18:11:55.299 +1200
OS Version:            macOS 11.6 (20G165)
Report Version:        12
Anonymous UUID:        DDCC3E05-CCE8-DB38-43E5-8378F721024C

Sleep/Wake UUID:       C6BAE207-875A-4EB8-BF73-823D7FDC7C8E

Time Awake Since Boot: 15000 seconds
Time Since Wake:       6900 seconds

System Integrity Protection: enabled

Crashed Thread:        1  Dispatch queue: com.apple.root.default-qos.overcommit

Exception Type:        EXC_BAD_ACCESS (SIGABRT)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

VM Regions Near 0:
--> 
	__TEXT                      10027c000-10070c000    [ 4672K] r-x/r-x SM=COW  /Library/PrivilegedHelperTools/*.Debug

Application Specific Information:
=================================================================
==3028==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fff2ca35dc7 bp 0x7000037a0820 sp 0x7000037a0800 T2)
==3028==The signal is caused by a READ memory access.
==3028==Hint: address points to the zero page.
	#0 0x7fff2ca35dc7 in swift::ResolveAsSymbolicReference::operator()(swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*)+0x37 (libswiftCore.dylib:x86_64+0x33cdc7)
	#1 0x7fff2ca582dc in swift::Demangle::__runtime::Demangler::demangleSymbolicReference(unsigned char)+0x8c (libswiftCore.dylib:x86_64+0x35f2dc)
	#2 0x7fff2ca552a7 in swift::Demangle::__runtime::Demangler::demangleType(__swift::__runtime::llvm::StringRef, std::__1::function<swift::Demangle::__runtime::Node* (swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*)>)+0xa7 (libswiftCore.dylib:x86_64+0x35c2a7)
	#3 0x7fff2ca3b5a3 in swift_getTypeByMangledNameImpl(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<swift::TargetMetadata<swift::InProcess> const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>)+0x203 (libswiftCore.dylib:x86_64+0x3425a3)
	#4 0x7fff2ca38d6c in swift::swift_getTypeByMangledName(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<swift::TargetMetadata<swift::InProcess> const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>)+0x1dc (libswiftCore.dylib:x86_64+0x33fd6c)
	#5 0x7fff2ca38f9a in swift_getTypeByMangledNameInContext+0xaa (libswiftCore.dylib:x86_64+0x33ff9a)
	#6 0x100553ee8 in __swift_instantiateConcreteTypeFromMangledName+0x58 (com.test.AppHelper.Debug:x86_64+0x1002d7ee8)
	#7 0x7fff201a2805 in _dispatch_client_callout+0x7 (libdispatch.dylib:x86_64+0x3805)
	#8 0x7fff201a398b in _dispatch_once_callout+0x13 (libdispatch.dylib:x86_64+0x498b)
	#9 0x7fff2ca47169 in swift_once+0x19 (libswiftCore.dylib:x86_64+0x34e169)
	#10 0x100678f80 in XPCRequestContext._currentForTask.unsafeMutableAddressor+0x30 (com.test.AppHelper.Debug:x86_64+0x1003fcf80)
	#11 0x1006792d3 in $s9SecureXPC17XPCRequestContextC10setForTask10connection7message9operationScTyxq_GSo13OS_xpc_object_p_SoAI_pAHyKXEtKs5ErrorR_r0_lFZ+0x343 (com.test.AppHelper.Debug:x86_64+0x1003fd2d3)
	#12 0x1005c8191 in XPCServer.handleMessage(connection:message:)+0x1b91 (com.test.AppHelper.Debug:x86_64+0x10034c191)
	#13 0x1005c59e6 in XPCServer.handleEvent(connection:event:)+0x3e6 (com.test.AppHelper.Debug:x86_64+0x1003499e6)
	#14 0x1005c4c7a in closure #1 in XPCServer.startClientConnection(_:)+0x1aa (com.test.AppHelper.Debug:x86_64+0x100348c7a)
	#15 0x10053cd43 in thunk for @escaping @callee_guaranteed (@guaranteed OS_xpc_object) -> ()+0x93 (com.test.AppHelper.Debug:x86_64+0x1002c0d43)
	#16 0x7fff20092c23 in _xpc_connection_call_event_handler+0x37 (libxpc.dylib:x86_64+0xcc23)
	#17 0x7fff20091a9a in _xpc_connection_mach_event+0x3a9 (libxpc.dylib:x86_64+0xba9a)
	#18 0x7fff201a28a5 in _dispatch_client_callout4+0x8 (libdispatch.dylib:x86_64+0x38a5)
	#19 0x7fff201b9a9f in _dispatch_mach_msg_invoke+0x1bb (libdispatch.dylib:x86_64+0x1aa9f)
	#20 0x7fff201a8492 in _dispatch_lane_serial_drain+0x106 (libdispatch.dylib:x86_64+0x9492)
	#21 0x7fff201ba5e1 in _dispatch_mach_invoke+0x1e3 (libdispatch.dylib:x86_64+0x1b5e1)
	#22 0x7fff201b2c0c in _dispatch_workloop_worker_thread+0x32a (libdispatch.dylib:x86_64+0x13c0c)
	#23 0x7fff2034945c in _pthread_wqthread+0x139 (libsystem_pthread.dylib:x86_64+0x345c)
	#24 0x7fff2034842e in start_wqthread+0xe (libsystem_pthread.dylib:x86_64+0x242e)
 
==3028==Register values:
rax = 0x0000000000000001  rbx = 0x00007000037a0b88  rcx = 0x000000000003556b  rdx = 0x0000000000000000  
rdi = 0x00007000037a0b88  rsi = 0x0000000000000000  rbp = 0x00007000037a0820  rsp = 0x00007000037a0800  
 r8 = 0x00000001006fd075   r9 = 0xffffe1ffff90bed0  r10 = 0x0000000000000001  r11 = 0x0000000000000000  
r12 = 0x000000000000000e  r13 = 0xffffffffffffffff  r14 = 0x0000000000000001  r15 = 0x0000000000000000  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (libswiftCore.dylib:x86_64+0x33cdc7) in swift::ResolveAsSymbolicReference::operator()(swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*)+0x37
Thread T2 created by T0 here:
	<empty stack>
 
==3028==ABORTING
 
abort() called

Thread 0:
0   libsystem_pthread.dylib       	0x00007fff20348420 start_wqthread + 0

Thread 1 Crashed:: Dispatch queue: com.apple.root.default-qos.overcommit
0   libsystem_kernel.dylib        	0x00007fff2031d92e __pthread_kill + 10
1   libsystem_pthread.dylib       	0x00007fff2034c5bd pthread_kill + 263
2   libsystem_c.dylib             	0x00007fff202a1406 abort + 125
3   libclang_rt.asan_osx_dynamic.dylib	0x0000000100bc4916 __sanitizer::Abort() + 70
4   libclang_rt.asan_osx_dynamic.dylib	0x0000000100bc4044 __sanitizer::Die() + 196
5   libclang_rt.asan_osx_dynamic.dylib	0x0000000100baba66 __asan::ScopedInErrorReport::~ScopedInErrorReport() + 422
6   libclang_rt.asan_osx_dynamic.dylib	0x0000000100ba976d __asan::ReportDeadlySignal(__sanitizer::SignalContext const&) + 157
7   libclang_rt.asan_osx_dynamic.dylib	0x0000000100ba8eff __asan::AsanOnDeadlySignal(int, void*, void*) + 95
8   libsystem_platform.dylib      	0x00007fff20391d7d _sigtramp + 29
9   ???                           	000000000000000000 0 + 0
10  libswiftCore.dylib            	0x00007fff2ca582dd swift::Demangle::__runtime::Demangler::demangleSymbolicReference(unsigned char) + 141
11  libswiftCore.dylib            	0x00007fff2ca552a8 swift::Demangle::__runtime::Demangler::demangleType(__swift::__runtime::llvm::StringRef, std::__1::function<swift::Demangle::__runtime::Node* (swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*)>) + 168
12  libswiftCore.dylib            	0x00007fff2ca3b5a4 swift_getTypeByMangledNameImpl(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<swift::TargetMetadata<swift::InProcess> const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 516
13  libswiftCore.dylib            	0x00007fff2ca38d6d swift::swift_getTypeByMangledName(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<swift::TargetMetadata<swift::InProcess> const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 477
14  libswiftCore.dylib            	0x00007fff2ca38f9b swift_getTypeByMangledNameInContext + 171
15  com.test.AppHelper.Debug	0x0000000100553ee9 __swift_instantiateConcreteTypeFromMangledName + 89
16  libdispatch.dylib             	0x00007fff201a2806 _dispatch_client_callout + 8
17  libdispatch.dylib             	0x00007fff201a398c _dispatch_once_callout + 20
18  libswiftCore.dylib            	0x00007fff2ca4716a swift_once + 26
19  com.test.AppHelper.Debug	0x0000000100678f81 XPCRequestContext._currentForTask.unsafeMutableAddressor + 49
20  com.test.AppHelper.Debug	0x00000001006792d4 $s9SecureXPC17XPCRequestContextC10setForTask10connection7message9operationScTyxq_GSo13OS_xpc_object_p_SoAI_pAHyKXEtKs5ErrorR_r0_lFZ + 836
21  com.test.AppHelper.Debug	0x00000001005c8192 XPCServer.handleMessage(connection:message:) + 7058
22  com.test.AppHelper.Debug	0x00000001005c59e7 XPCServer.handleEvent(connection:event:) + 999
23  com.test.AppHelper.Debug	0x00000001005c4c7b closure #1 in XPCServer.startClientConnection(_:) + 427
24  com.test.AppHelper.Debug	0x000000010053cd44 thunk for @escaping @callee_guaranteed (@guaranteed OS_xpc_object) -> () + 148
25  libxpc.dylib                  	0x00007fff20092c24 _xpc_connection_call_event_handler + 56
26  libxpc.dylib                  	0x00007fff20091a9b _xpc_connection_mach_event + 938
27  libdispatch.dylib             	0x00007fff201a28a6 _dispatch_client_callout4 + 9
28  libdispatch.dylib             	0x00007fff201b9aa0 _dispatch_mach_msg_invoke + 444
29  libdispatch.dylib             	0x00007fff201a8493 _dispatch_lane_serial_drain + 263
30  libdispatch.dylib             	0x00007fff201ba5e2 _dispatch_mach_invoke + 484
31  libdispatch.dylib             	0x00007fff201b2c0d _dispatch_workloop_worker_thread + 811
32  libsystem_pthread.dylib       	0x00007fff2034945d _pthread_wqthread + 314
33  libsystem_pthread.dylib       	0x00007fff2034842f start_wqthread + 15

Thread 2:
0   libsystem_kernel.dylib        	0x00007fff2031cb0a __sigsuspend_nocancel + 10
1   libdispatch.dylib             	0x00007fff201b34e1 _dispatch_sigsuspend + 36
2   libdispatch.dylib             	0x00007fff201b34bd _dispatch_sig_thread + 53

Thread 1 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000000  rbx: 0x00007000037a4000  rcx: 0x000070000379f658  rdx: 0x0000000000000000
  rdi: 0x0000000000001803  rsi: 0x0000000000000006  rbp: 0x000070000379f680  rsp: 0x000070000379f658
   r8: 0x0000000100c96107   r9: 0x0000000000000001  r10: 0x0000000000000000  r11: 0x0000000000000246
  r12: 0x0000000000001803  r13: 0xffffffffffffffff  r14: 0x0000000000000006  r15: 0x0000000000000016
  rip: 0x00007fff2031d92e  rfl: 0x0000000000000246  cr2: 0x000010002091bc00
  
Logical CPU:     0
Error Code:      0x02000148
Trap Number:     133

Thread 1 instruction stream:
  89 98 05 00 e9 df fe ff-ff e8 cb d9 fe ff 0f 1f  ................
  00 f8 ff ff ff d0 ff ff-ff c0 ff ff ff cb ff ff  ................
  ff c7 ff ff ff 0f 1f 40-00 55 48 89 e5 41 57 41  .......@.UH..AWA
  56 53 50 89 d0 48 89 fb-48 63 d1 4c 01 c2 83 f8  VSP..H..Hc.L....
  01 75 0e 40 84 f6 0f 85-8d 00 00 00 48 8b 12 eb  .u.@........H...
  0f 40 84 f6 74 0a 48 8b-3b be 03 01 00 00 eb 21  .@..t.H.;......!
 [8b]02 89 c1 80 e1 1f 80-f9 03 74 0b 80 f9 04 75  ..........t....u	<==
  1f 66 b8 05 01 eb 04 66-b8 98 00 48 8b 3b 0f b7  .f.....f...H.;..
  f0 48 83 c4 08 5b 41 5e-41 5f 5d e9 09 fa 01 00  .H...[A^A_].....
  45 31 f6 48 85 d2 74 33-83 e0 10 74 2e 48 8b 3b  E1.H..t3...t.H.;
  be c3 00 00 00 e8 ef f9-01 00 49 89 c7 48 8b 3b  ..........I..H.;
  be c2 00 00 00 e8 4f f9-01 00 49 89 c6 48 8b 13  ......O...I..H..
  
Thread 1 last branch register state not available.

It's looking like it crashes when I call any route that has an async handler. I'll experiment with that theory more in the morning.

You're correct, the stack trace makes it clear this is crashing while doing the setup work to call an async handler.

I did a bit of investigation, and unfortunately the answer is that this can't work because Apple did not fully backport Swift concurrency. See this Apple Developer Forums discussion. As a result, I'll improve the documentation for SecureXPC, but only Apple can actually fix the issue.

The bottom line is that a helper tool installed with SMJobBless can't use Swift concurrency on Catalina or Big Sur, whether or not it uses SecureXPC.

Here's what's going on:

  • Tools installed with SMJobBless must be command line tools, not app bundles
  • macOS Monterey has native support for Swift concurrency
  • macOS Catalina and Big Sur support Swift concurrency by having Xcode automatically add libswift_Concurrency.dylib as a framework to any app bundles targeting Catalina or Big Sur as their minimum version
  • Xcode doesn't (nor do I believe it could) add libswift_Concurrency.dylib as a framework for command line tools
  • When a command line tool tries to use Swift concurrency on Catalina or Big Sur, it crashes

The reason none of the testing caught this is that async routes do work on Catalina and Big Sur, just not from a command line tool. They work fine in XPC services, as anonymous servers inside apps, in login items, etc., all of which are app bundles. (All of the automated testing uses anonymous servers, which run inside Xcode's test harness.)

Wow, that makes sense. Thanks so much for your analysis — I really appreciate your time digging into it.

I did notice the bundled libswift_Concurrency.dylib in my app, and wondered how the helper handled that, but put it down to lack of knowledge on my part :) I still don't quite understand why the helper tool can't have a dependency on libswift_Concurrency.dylib. I guess it's just a dylib path problem (i.e. where could it find this dylib?)

I'll just go through and convert my async handlers to use callbacks.

ps crikey, do you ever sleep? 🤪

> I still don't quite understand why the helper tool can't have a dependency on libswift_Concurrency.dylib. I guess it's just a dylib path problem (i.e. where could it find this dylib?)

Once installed by SMJobBless, the helper tool is copied from your app bundle to /Library/PrivilegedHelperTools/. After it has been relocated, it has no way to reference the libswift_Concurrency.dylib in your app bundle, because it doesn't know about the app it was copied from. I find the overall design of SMJobBless questionable, but unfortunately it's the only supported way Apple provides to generically run as root.
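To make the path problem concrete, here is a minimal sketch of how an `@rpath/` install name expands relative to the executable's location. It is a simplified model of the loader's behaviour (only `@executable_path` is substituted), and the run-path entries are hypothetical; the real ones depend on your build settings.

```swift
import Foundation

/// Expands an `@rpath/...` install name into the concrete paths that would be
/// tried, given the executable's location and its run-path entries.
/// Simplified model: only `@executable_path` is substituted.
func candidatePaths(installName: String,
                    executablePath: String,
                    rpaths: [String]) -> [String] {
    let prefix = "@rpath/"
    guard installName.hasPrefix(prefix) else { return [installName] }
    let leaf = String(installName.dropFirst(prefix.count))
    let executableDir = URL(fileURLWithPath: executablePath)
        .deletingLastPathComponent().path
    return rpaths.map { entry in
        let expanded = entry.replacingOccurrences(of: "@executable_path",
                                                  with: executableDir)
        return URL(fileURLWithPath: expanded)
            .appendingPathComponent(leaf)
            .standardized.path
    }
}

// Hypothetical run-path entries, chosen for illustration only.
let rpaths = ["/usr/lib/swift", "@executable_path/../Frameworks"]

let candidates = candidatePaths(
    installName: "@rpath/libswift_Concurrency.dylib",
    executablePath: "/Library/PrivilegedHelperTools/com.test.AppHelper.Debug",
    rpaths: rpaths)

// Print every location the relocated helper would search under this model.
for path in candidates {
    print(path)
}
```

Under these assumed entries, every candidate is computed from where the helper now lives, so none of them points back into the app bundle that originally contained the dylib.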

> I'll just go through and convert my async handlers to use callbacks.

Yeah, unfortunately that does seem to be the only option.

> ps crikey, do you ever sleep? 🤪

Oh, I'm not in New Zealand at the moment.

Sorry to prolong this, but I'm confused :( I've just re-read the documentation more carefully and realised it states that both async and closures will only work in the helper in Monterey and later. So converting my async handlers to use callbacks isn't going to help... or am I misunderstanding what's meant by the docs?

My Process handlers will be slow; if I can use neither async nor callbacks, should I just use process.waitUntilExit and rely on SecureXPC calling the handlers asynchronously?

I've just played with the SwiftAuthorizationSample code and confirmed that I'm wrong above; without async handlers, SecureXPC calls the handlers synchronously (as one would expect). Multiple calls to slow handlers (e.g. running a slow process.waitUntilExit) will each wait for the preceding one to complete. Pretty much by definition of being not async.

Is there actually a closure-based alternative to async handlers? Or is the closure-based API only available on the client side?

I'm really hoping I've not misunderstood the implication of this issue. It's pretty critical for my helper to run multiple Processes simultaneously.

Interestingly enough, Xcode already compiles the helper with a dynamic link to @rpath/libswift_Concurrency.dylib, and thus copying the dylib into /Library/PrivilegedHelperTools made the whole problem go away in Big Sur. However, I doubt there's a clean way to "install" a copy of libswift_Concurrency.dylib somewhere safe at runtime.

> Sorry to prolong this, but I'm confused :( I've just re-read the documentation more carefully and realised it states that both async and closures will only work in the helper in Monterey and later. So converting my async handlers to use callbacks isn't going to help... or am I misunderstanding what's meant by the docs?

Assuming you mean the sentence "On macOS 10.15 and later async functions and closures...", the "async" there applies to both functions and closures; in other words, "async functions" and "async closures". Ordinary (non-async) Swift closures work on all operating systems.
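As a small illustration of that distinction, the sketch below declares one plain closure and one async closure. The names are made up for the example and are not part of SecureXPC.

```swift
// A plain (synchronous) closure: does not depend on Swift concurrency.
let syncHandler: (Int) -> Int = { value in
    value * 2
}

// An async closure: this is the form that depends on Swift concurrency.
// It is only declared here and never called.
let asyncHandler: (Int) async -> Int = { value in
    value * 2
}
_ = asyncHandler

print(syncHandler(21))
```

Only the second form is affected by the concurrency back-deployment problem discussed in this issue.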

> I've just played with the SwiftAuthorizationSample code and confirmed that I'm wrong above; without async handlers, SecureXPC calls the handlers synchronously (as one would expect). Multiple calls to slow handlers (e.g. running a slow process.waitUntilExit) will each wait for the preceding one to complete. Pretty much by definition of being not async.

Have you set a concurrent DispatchQueue for the server's targetQueue property? The sample keeps things simple and does not do that (nor would it make sense for how the sample behaves).
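To show the difference a concurrent queue makes, here is a minimal sketch using only Dispatch and Foundation. The sleeping blocks stand in for slow handlers such as one calling process.waitUntilExit; the queue labels and the `runSlowHandlers` function are made up for the example, and the commented-out assignment at the end is an assumption about how the targetQueue property would be used.

```swift
import Dispatch
import Foundation

/// Runs `count` blocking "handlers" on `queue` and returns the elapsed seconds.
/// Each handler sleeps to stand in for a slow, blocking call.
func runSlowHandlers(on queue: DispatchQueue, count: Int, seconds: Double) -> Double {
    let group = DispatchGroup()
    let start = Date()
    for _ in 0..<count {
        queue.async(group: group) {
            Thread.sleep(forTimeInterval: seconds)
        }
    }
    group.wait()
    return Date().timeIntervalSince(start)
}

let serialQueue = DispatchQueue(label: "example.serial")
let concurrentQueue = DispatchQueue(label: "example.concurrent",
                                    attributes: .concurrent)

// On the serial queue each handler waits for the previous one to finish.
let serialElapsed = runSlowHandlers(on: serialQueue, count: 4, seconds: 0.2)
// On the concurrent queue the handlers can overlap.
let concurrentElapsed = runSlowHandlers(on: concurrentQueue, count: 4, seconds: 0.2)

print("serial:", serialElapsed)
print("concurrent:", concurrentElapsed)

// In a helper, the analogous step would be giving the server a concurrent
// queue via its targetQueue property (exact usage is an assumption):
// server.targetQueue = concurrentQueue
```

The serial run takes roughly the sum of the handler durations, while the concurrent run should finish in closer to the time of a single handler, which is the behaviour wanted for running several Processes at once.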

> Interestingly enough, Xcode already compiles the helper with a dynamic link to @rpath/libswift_Concurrency.dylib, and thus copying the dylib into /Library/PrivilegedHelperTools made the whole problem go away in Big Sur. However, I doubt there's a clean way to "install" a copy of libswift_Concurrency.dylib somewhere safe at runtime.

Indeed, there is not.

I think the original issue topic here is sorted; the reasons for the crash are now well identified. In case it helps anyone else who comes across this, there's a related issue regarding handler concurrency at #92.