Hang detection causes app to hang
Closed this issue · 0 comments
Platform
iOS
Environment
Production, Develop
Installed
Swift Package Manager
Version
N/A
Xcode Version
N/A
Did it work on previous versions?
No response
Steps to Reproduce
I've been debugging an app hang, and it seems to be caused by a deadlock triggered by the Sentry ANR detection. Here is the stack trace I get when sampling the threads in my app during the hang:
+ 7566 Thread_460801: io.sentry.app-hang-tracker
+ 7566 thread_start (in libsystem_pthread.dylib) + 8 [0x10513a6f0]
+ 7566 _pthread_start (in libsystem_pthread.dylib) + 104 [0x10513f4c0]
+ 7566 __NSThread__start__ (in Foundation) + 720 [0x180df3e70]
+ 7566 -[SentryANRTracker detectANRs] (in MyApp) + 928 [0x1116fb804]
+ 7566 -[SentryANRTracker ANRDetected] (in MyApp) + 220 [0x1116fbdf0]
+ 7566 -[SentryANRTrackingIntegration anrDetected] (in MyApp) + 196 [0x111737924]
+ 7566 -[SentryThreadInspector getCurrentThreadsWithStackTrace] (in MyApp) + 232 [0x1116f7d64]
+ 7566 getStackEntriesFromThread (in MyApp) + 108 [0x1116f7718]
+ 7566 symbolicate_internal (in MyApp) + 108 [0x11171a0d8]
+ 7566 sentrycrashdl_dladdr (in MyApp) + 108 [0x11174f70c]
+ 7566 dyld4::APIs::_dyld_get_image_header(unsigned int) (in dyld_sim) + 112 [0x10482f8ec]
+ 7566 dyld4::RuntimeLocks::withLoadersReadLock(void () block_pointer) (in dyld_sim) + 56 [0x104816010]
+ 7566 _os_unfair_lock_lock_slow (in libsystem_platform.dylib) + 204 [0x1051199c4]
+ 7566 __ulock_wait (in libsystem_kernel.dylib) + 8 [0x10509e7dc]
And at the same time a different thread has this stack trace:
+ 7566 _os_log_impl_stream (in libsystem_trace.dylib) + 528 [0x1800a72f0]
+ 7566 _os_activity_stream_reflect (in libsystem_trace.dylib) + 320 [0x180098e94]
+ 7566 dyld4::APIs::dyld_image_path_containing_address(void const*) (in dyld_sim) + 68 [0x104831f9c]
+ 7566 dyld4::APIs::findImageMappedAt(void const*, dyld3::MachOLoaded const**, bool*, char const**, void const**, unsigned long long*, unsigned char*, dyld4::Loader const**) (in dyld_sim) + 544 [0x104831ce0]
+ 7566 dyld4::RuntimeLocks::withLoadersReadLock(void () block_pointer) (in dyld_sim) + 56 [0x104816010]
+ 7566 _os_unfair_lock_lock_slow (in libsystem_platform.dylib) + 204 [0x1051199c4]
+ 7566 __ulock_wait (in libsystem_kernel.dylib) + 8 [0x10509e7dc]
This looks like a typical deadlock caused by calling code that is not async-signal-safe while all other threads are suspended. In the first trace, sentrycrashdl_dladdr is acquiring a lock, even though the header comment for this function says "async-safe version of dladdr." (https://github.com/getsentry/sentry-cocoa/blob/d9280eec4f311096709ba84fc6a5d04423503c6c/Sources/SentryCrash/Recording/Tools/SentryCrashDynamicLinker.h#L102C5-L102C34) That does not appear to be the case: the function is not safe to call in this context, because it uses _dyld_get_image_header, which tries to acquire the dyld loaders lock. If a suspended thread already holds that lock, the ANR tracker thread waits on it forever.
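One possible direction, as a rough sketch only and not how sentry-cocoa is implemented: record image address ranges ahead of time from a normal context (for example from a callback registered with _dyld_register_func_for_add_image), and have the code that runs while threads are suspended consult only that cache. All names below (cached_image_t, image_cache_add, image_cache_find, MAX_CACHED_IMAGES) are hypothetical, and the sketch assumes a single writer at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed capacity; no allocation happens after startup. */
#define MAX_CACHED_IMAGES 512

/* Hypothetical cache entry: address range and name of one loaded image. */
typedef struct {
    uintptr_t start;
    uintptr_t end;     /* exclusive */
    const char *name;  /* must stay valid for the lifetime of the cache */
} cached_image_t;

static cached_image_t g_images[MAX_CACHED_IMAGES];
static atomic_size_t g_image_count;

/* Call only from a normal (non-suspended) context, e.g. an image-added
 * callback. Assumes one writer at a time. Returns 1 on success, 0 if full. */
int image_cache_add(uintptr_t start, size_t size, const char *name)
{
    assert(name != NULL);
    size_t n = atomic_load_explicit(&g_image_count, memory_order_relaxed);
    if (n >= MAX_CACHED_IMAGES) {
        return 0;
    }
    g_images[n].start = start;
    g_images[n].end = start + size;
    g_images[n].name = name;
    /* Publish the fully written entry to readers. */
    atomic_store_explicit(&g_image_count, n + 1, memory_order_release);
    return 1;
}

/* Intended for use while other threads are suspended: takes no locks,
 * allocates nothing, and makes no dyld calls. Returns NULL if the address
 * is not inside any cached image. */
const cached_image_t *image_cache_find(uintptr_t address)
{
    size_t n = atomic_load_explicit(&g_image_count, memory_order_acquire);
    for (size_t i = 0; i < n; i++) {
        if (address >= g_images[i].start && address < g_images[i].end) {
            return &g_images[i];
        }
    }
    return NULL;
}
```

With something like this, the lookup done during stack symbolication would only read memory that was filled in earlier, so it could not block on a lock held by a suspended thread. Image unloading is not handled here and would need extra care.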
Expected Result
The app does not deadlock, and all code called in the ANR handler is async-signal-safe.
Actual Result
The app hangs
Are you willing to submit a PR?
No response