getsentry/sentry-cocoa

Hang detection causes app to hang

Closed this issue · 0 comments

Platform

iOS

Environment

Production, Develop

Installed

Swift Package Manager

Version

N/A

Xcode Version

N/A

Did it work on previous versions?

No response

Steps to Reproduce

I've been debugging an app hang, and it seems to be caused by a deadlock triggered by Sentry's ANR detection. Here is the stack trace I get when sampling the threads in my app during the hang:

+ 7566 Thread_460801: io.sentry.app-hang-tracker
+ 7566 thread_start  (in libsystem_pthread.dylib) + 8  [0x10513a6f0]
+ 7566 _pthread_start  (in libsystem_pthread.dylib) + 104  [0x10513f4c0]
+ 7566 __NSThread__start__  (in Foundation) + 720  [0x180df3e70]
+ 7566 -[SentryANRTracker detectANRs]  (in MyApp) + 928  [0x1116fb804]
+ 7566 -[SentryANRTracker ANRDetected]  (in MyApp) + 220  [0x1116fbdf0]
+ 7566 -[SentryANRTrackingIntegration anrDetected]  (in MyApp) + 196  [0x111737924]
+ 7566 -[SentryThreadInspector getCurrentThreadsWithStackTrace]  (in MyApp) + 232  [0x1116f7d64]
+ 7566 getStackEntriesFromThread  (in MyApp) + 108  [0x1116f7718]
+ 7566 symbolicate_internal  (in MyApp) + 108  [0x11171a0d8]
+ 7566 sentrycrashdl_dladdr  (in MyApp) + 108  [0x11174f70c]
+ 7566 dyld4::APIs::_dyld_get_image_header(unsigned int)  (in dyld_sim) + 112  [0x10482f8ec]
+ 7566 dyld4::RuntimeLocks::withLoadersReadLock(void () block_pointer)  (in dyld_sim) + 56  [0x104816010]
+ 7566 _os_unfair_lock_lock_slow  (in libsystem_platform.dylib) + 204  [0x1051199c4]
+ 7566 __ulock_wait  (in libsystem_kernel.dylib) + 8  [0x10509e7dc]

And at the same time a different thread has this stack trace:

+ 7566 _os_log_impl_stream  (in libsystem_trace.dylib) + 528  [0x1800a72f0]
+ 7566 _os_activity_stream_reflect  (in libsystem_trace.dylib) + 320  [0x180098e94]
+ 7566 dyld4::APIs::dyld_image_path_containing_address(void const*)  (in dyld_sim) + 68  [0x104831f9c]
+ 7566 dyld4::APIs::findImageMappedAt(void const*, dyld3::MachOLoaded const**, bool*, char const**, void const**, unsigned long long*, unsigned char*, dyld4::Loader const**)  (in dyld_sim) + 544  [0x104831ce0]
+ 7566 dyld4::RuntimeLocks::withLoadersReadLock(void () block_pointer)  (in dyld_sim) + 56  [0x104816010]
+ 7566 _os_unfair_lock_lock_slow  (in libsystem_platform.dylib) + 204  [0x1051199c4]
+ 7566 __ulock_wait  (in libsystem_kernel.dylib) + 8  [0x10509e7dc]

This looks like a typical deadlock caused by calling code that is not async-signal-safe while the other threads are suspended. In the first trace, sentrycrashdl_dladdr is blocked waiting for a lock even though the header comment for this function is "async-safe version of dladdr." (https://github.com/getsentry/sentry-cocoa/blob/d9280eec4f311096709ba84fc6a5d04423503c6c/Sources/SentryCrash/Recording/Tools/SentryCrashDynamicLinker.h#L102C5-L102C34) That comment doesn't appear to hold: the function is not safe to call in this context, because it uses _dyld_get_image_header, which tries to acquire dyld's loaders lock (withLoadersReadLock in both traces). If a suspended thread already holds that lock, the ANR tracker thread waits on it forever.
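To illustrate the failure mode, here is a minimal, portable sketch using plain pthreads. It is not Sentry's code and not a proposed fix. Everything in it is hypothetical: `loaders_lock` stands in for dyld's internal lock, the worker waiting on a condition variable stands in for a suspended thread, and `try_symbol_lookup` stands in for a lookup such as sentrycrashdl_dladdr. The sketch uses `pthread_mutex_trylock` so that it reports the conflict instead of hanging; a blocking `pthread_mutex_lock` at the same point would never return, which matches the behavior in the traces.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for dyld's loaders lock. */
static pthread_mutex_t loaders_lock = PTHREAD_MUTEX_INITIALIZER;

/* Handshake state between the worker and the sampling thread. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_cond = PTHREAD_COND_INITIALIZER;
static bool worker_holds_lock = false;
static bool worker_may_resume = false;

/* Worker: takes the lock, then stops making progress until "resumed".
 * This simulates a thread that was suspended while inside dyld. */
static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&loaders_lock);

    pthread_mutex_lock(&state_lock);
    worker_holds_lock = true;
    pthread_cond_broadcast(&state_cond);
    while (!worker_may_resume) {
        pthread_cond_wait(&state_cond, &state_lock);
    }
    pthread_mutex_unlock(&state_lock);

    pthread_mutex_unlock(&loaders_lock);
    return NULL;
}

/* Hypothetical lookup that needs the lock. Returns true if it could run,
 * false if the lock was held and the lookup had to be skipped. */
static bool try_symbol_lookup(void) {
    if (pthread_mutex_trylock(&loaders_lock) != 0) {
        return false; /* a blocking lock here would wait forever */
    }
    pthread_mutex_unlock(&loaders_lock);
    return true;
}

/* Runs the scenario once (it relies on the static state above).
 * Returns 0 on success, -1 if the worker thread could not be created. */
int run_demo(bool *while_suspended, bool *after_resume) {
    pthread_t thread;
    if (pthread_create(&thread, NULL, worker, NULL) != 0) {
        return -1;
    }

    /* Wait until the worker is parked while holding the lock. */
    pthread_mutex_lock(&state_lock);
    while (!worker_holds_lock) {
        pthread_cond_wait(&state_cond, &state_lock);
    }
    pthread_mutex_unlock(&state_lock);

    /* Lookup attempted while the lock holder cannot run. */
    *while_suspended = try_symbol_lookup();

    /* "Resume" the worker so it releases the lock, then retry. */
    pthread_mutex_lock(&state_lock);
    worker_may_resume = true;
    pthread_cond_broadcast(&state_cond);
    pthread_mutex_unlock(&state_lock);
    pthread_join(thread, NULL);

    *after_resume = try_symbol_lookup();
    return 0;
}
```

Calling `run_demo` sets `*while_suspended` to false and `*after_resume` to true. Note that a caller cannot try-lock dyld's internal lock the way this sketch try-locks its own mutex, so in practice the safer pattern is likely to avoid lock-taking calls entirely while threads are suspended, for example by collecting raw addresses first and symbolicating only after the threads have been resumed.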

Expected Result

The app does not deadlock, and all code called in the ANR handler is async-signal-safe.

Actual Result

The app hangs.

Are you willing to submit a PR?

No response