launchdarkly/ios-client-sdk

Deadlock in v5.4.5

danwood opened this issue · 4 comments

Is this a support request?
no, it's a bug report

Describe the bug
We got a deadlock between the main thread waiting for the com.launchdarkly.DiagnosticCache.cacheQueue thread, and that thread waiting for the main thread.

To reproduce
Not reproducible :-(

Expected behavior
We should not have a deadlock :-)

Logs
Main Thread: (excerpted, this is the part called from our app's code)

                                                  1000  LDClient.variationInternal<A>(forKey:defaultValue:includeReason:) + 2736 (LaunchDarkly + 303776) [0x1057e62a0]
                                                    1000  EventReporter.recordFlagEvaluationEvents(flagKey:value:defaultValue:featureFlag:user:includeReason:) + 992 (LaunchDarkly + 186444) [0x1057c984c]
                                                      1000  _dispatch_sync_f_slow + 144 (libdispatch.dylib + 76944) [0x185999c90]
                                                        1000  __DISPATCH_WAIT_FOR_QUEUE__ + 336 (libdispatch.dylib + 78028) [0x18599a0cc]
                                                          1000  _dispatch_event_loop_wait_for_ownership + 444 (libdispatch.dylib + 161256) [0x1859ae5e8]
                                                            1000  kevent_id + 8 (libsystem_kernel.dylib + 13960) [0x185b0f688]
                                                             *1000  ??? (kernel.release.t6000 + 5642160) [0xfffffe0007e757b0] (blocked by turnstile waiting for #####REDACTED##### [65883] [unique pid 165569] thread 0x27f1bc)

Thread 0x27f1bc:

  Thread 0x27f1bc    DispatchQueue "com.launchdarkly.DiagnosticCache.cacheQueue"(562)    1000 samples (1-1000)    priority 46 (base 46)
  1000  start_wqthread + 8 (libsystem_pthread.dylib + 8900) [0x185b442c4]
    1000  _pthread_wqthread + 288 (libsystem_pthread.dylib + 13744) [0x185b455b0]
      1000  _dispatch_workloop_worker_thread + 656 (libdispatch.dylib + 91912) [0x18599d708]
        1000  _dispatch_lane_invoke + 392 (libdispatch.dylib + 48804) [0x185992ea4]
          1000  _dispatch_lane_serial_drain + 672 (libdispatch.dylib + 45872) [0x185992330]
            1000  _dispatch_client_callout + 20 (libdispatch.dylib + 15276) [0x18598abac]
              1000  _dispatch_call_block_and_release + 32 (libdispatch.dylib + 7776) [0x185988e60]
                1000  thunk for @escaping @callee_guaranteed () -> () + 28 (LaunchDarkly + 384280) [0x1057f9d18]
                  1000  closure #1 in LDTimer.timerFired() + 128 (LaunchDarkly + 385392) [0x1057fa170]
                    1000  EventReporter.reportEvents(completion:) + 1220 (LaunchDarkly + 188704) [0x1057ca120]
                      1000  _dispatch_lane_barrier_sync_invoke_and_complete + 56 (libdispatch.dylib + 77312) [0x185999e00]
                        1000  _dispatch_client_callout + 20 (libdispatch.dylib + 15276) [0x18598abac]
                          1000  thunk for @escaping @callee_guaranteed () -> () + 20 (LaunchDarkly + 285612) [0x1057e1bac]
                            1000  thunk for @callee_guaranteed () -> () + 20 (LaunchDarkly + 285580) [0x1057e1b8c]
                              1000  closure #1 in DiagnosticCache.updateStoredDataSync(updateFunc:) + 168 (LaunchDarkly + 106644) [0x1057b6094]
                                1000  StoreData.save(_:) + 252 (LaunchDarkly + 105008) [0x1057b5a30]
                                  1000  -[NSNotificationCenter postNotificationName:object:userInfo:] + 96 (Foundation + 42464) [0x186a915e0]
                                    1000  _CFXNotificationPost + 800 (CoreFoundation + 294496) [0x185bd7e60]
                                      1000  -[NSOperation waitUntilFinished] + 584 (Foundation + 349440) [0x186adc500]
                                        1000  __psynch_cvwait + 8 (libsystem_kernel.dylib + 20672) [0x185b110c0]
                                         *1000  psynch_cvcontinue + 0 (pthread + 18008) [0xfffffe000a508a18]
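
Read together, the two traces suggest a wait cycle. The main thread is parked in a `dispatch_sync` onto the cache queue. The cache queue is parked inside a notification post that is itself waiting on an operation, which the description above says needs the main thread. The sketch below is a toy model of that shape, not SDK code. `appQueue`, `cacheQueue`, and `observerDelivered` are hypothetical names, a background serial queue stands in for the main thread, and a timeout replaces the indefinite wait so that the program terminates. It assumes the notification observer was registered to run on the app's thread, which the thread does not confirm.

```swift
import Foundation

// Hypothetical stand-ins, not SDK code:
// `appQueue` plays the main thread, `cacheQueue` plays the SDK's serial cache queue.
let appQueue = DispatchQueue(label: "example.app")
let cacheQueue = DispatchQueue(label: "example.cache")

/// Runs the toy scenario once and reports whether the "observer" ran on
/// `appQueue` before `timeout` elapsed. With `appBlocksOnCache == true` the
/// wait cycle is reproduced and only the timeout lets the program move on.
func observerDelivered(appBlocksOnCache: Bool, timeout: TimeInterval = 0.5) -> Bool {
    let cacheBusy = DispatchSemaphore(value: 0)
    let observerRan = DispatchSemaphore(value: 0)
    let delivered = DispatchSemaphore(value: 0)
    let done = DispatchGroup()

    // App side. It is queued first, so on the serial appQueue it runs
    // ahead of the observer block that the cache side enqueues later.
    done.enter()
    appQueue.async {
        cacheBusy.wait()              // wait until the cache block is running
        if appBlocksOnCache {
            cacheQueue.sync {}        // blocks until the cache block returns
        } else {
            cacheQueue.async {}       // fire and forget, does not block
        }
        done.leave()
    }

    // Cache side. It "posts a notification" whose observer must run on
    // appQueue, then waits for that observer (here with a timeout).
    done.enter()
    cacheQueue.async {
        cacheBusy.signal()
        appQueue.async { observerRan.signal() }
        if observerRan.wait(timeout: .now() + timeout) == .success {
            delivered.signal()
        }
        done.leave()
    }

    done.wait()
    return delivered.wait(timeout: .now()) == .success
}

print(observerDelivered(appBlocksOnCache: true))   // observer starved by the cycle
print(observerDelivered(appBlocksOnCache: false))  // observer runs normally
```

In this model the cycle goes away as soon as either side stops waiting synchronously on the other. Which side was actually at fault in the report above is not something the traces alone establish.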

SDK version
5.4.5

Language version, developer tools
Xcode 13.4.1

OS/platform
macOS Monterey 12.4

Thank you for the bug report. I haven't had a chance to dig into this yet but I didn't want to leave you over the weekend without some acknowledgement.

I see you are using v5.4.5. Are you in a position to upgrade to the v6 SDK?

Haven't tried that yet … the strange thing is that we've only started seeing these issues recently, and we've been on 5.4.5 since the end of April. We can investigate moving up to v6, but that's a bigger task than we were hoping to take on just to avoid some deadlocks …

From what I can see, we're only trying to read the flag's current value, which should be an immediate-return operation. I can see why there would be an async operation to record on the server, for statistical purposes, that the flag was evaluated. But why would the main thread wait for that operation to complete before it could return?

Gonna withdraw our report. I wasn't the one who fixed it, but something we were doing in our own code interacted poorly with the SDK here, and that's what caused this.