aws-amplify/aws-sdk-ios

Crash on [AWSMQTTSession publishDataAtLeastOnce:onTopic:retain:onMessageIdResolved:] function

09chen1998 opened this issue · 18 comments

Describe the bug
Our app occasionally crashes. Some crashes occur after calling subscribeToTopic, some after calling publishData, and some during other operations.

To Reproduce
Steps to reproduce the behavior:
There are no fixed reproduction steps; the crashes occur at irregular intervals.

Screenshots
image

Environment (please complete the following information):

  • SDK Version: [2.37.0]
  • Dependency Manager: [CocoaPods]
  • Swift Version: [5.9.2]
  • Xcode Version: [15.2]

Device Information (please complete the following information):

  • Device: [iPhone 8, iPhone 13, iPhone 13 Pro Max]
  • iOS Version: [iOS 16.5, iOS 17.5.1, iOS 17.6.1]

Logs
image

Could it be caused by multiple threads accessing the same resource?
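
To make the question concrete: the crashing frame in the log below is `-[__NSSetM addObject:]`, a mutation of an `NSMutableSet`, and Foundation's mutable collections are not safe to mutate from several threads at once. The following is only a minimal sketch of the usual guard, which funnels every access through one serial queue. `GuardedSet` is a hypothetical type for illustration and is not part of the AWS SDK; whether the SDK's set is actually reached from more than one thread is not confirmed here.

```swift
import Foundation

/// Hypothetical wrapper (not an AWS SDK type): every read and write of the
/// underlying set goes through one serial queue, so mutations never overlap.
final class GuardedSet<Element: Hashable> {
    private var storage = Set<Element>()
    private let queue = DispatchQueue(label: "example.guarded-set")

    func insert(_ element: Element) {
        queue.sync { _ = storage.insert(element) }
    }

    func remove(_ element: Element) {
        queue.sync { _ = storage.remove(element) }
    }

    var count: Int {
        queue.sync { storage.count }
    }
}

// Many threads insert at once; the serial queue orders the mutations.
let pending = GuardedSet<Int>()
DispatchQueue.concurrentPerform(iterations: 1_000) { i in
    pending.insert(i)
}
print(pending.count)
```

Without the queue, the same concurrent inserts into a bare mutable set would be a data race, which can surface as the kind of intermittent memory-access crash described here.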

Thanks for your crash report. One of our team members will investigate this as soon as possible.

@vincetran Here is the newest screenshot, taken when the app crashed:
image

@09chen1998 Instead of screenshots, can you provide a full crash log that includes the crashing thread along with all of the other running threads?

@phantumcode The complete crash log is the screenshot of the log I provided above. The latest screenshot I sent is not a crash in the production/test environment, so there is no complete log left. However, judging from the code execution sequence, it is caused by the same problem as the log under "Logs" I provided.

@09chen1998 A complete crash log would include the stack trace of the crashing thread as well as the stack traces of all other threads running at the time of the crash. Having a full crash log is helpful in diagnosing crashes caused by threading issues.

@phantumcode Here is the complete crash log.

SIGSEGV
SEGV_ACCERR

0  CoreFoundation  -[__NSSetM addObject:] + 356
1  CoreFoundation  -[__NSSetM addObject:] + 332
2  AWSIoT  -[AWSMQTTSession publishDataAtLeastOnce:onTopic:retain:onMessageIdResolved:] + 716
3  AWSIoT  -[AWSIoTMQTTClient publishData:qos:onTopic:retain:ackCallback:] + 1144
4  AWSIoT  -[AWSIoTDataManager publishData:onTopic:QoS:retain:ackCallback:] + 388
5  AWSIoT  -[AWSIoTDataManager publishData:onTopic:QoS:ackCallback:] + 148
6  AWSIoT  -[AWSIoTDataManager publishData:onTopic:QoS:] + 116
7  JBELink  $s7JBELink18JLSubscribeManagerC18publishDataToTopic5topic6params8baseInfo10completion8progressySSSg_SaySDySSypSgGGSDySSypGSgySb_AA011JLSubDeviceK0VSgAKtcSgySScSgtFySbcfU_ + 916
8  JBELink  $s7JBELink18JLSubscribeManagerC18publishDataToTopic5topic6params8baseInfo10completion8progressySSSg_SaySDySSypSgGGSDySSypGSgySb_AA011JLSubDeviceK0VSgAKtcSgySScSgtFySbcfU_TA + 68
9  JBELink  $s7JBELink18JLSubscribeManagerC15configAwsStatus33_7DEEC69F00E0BBD387F70BD528265A31LL10completionyySbc_tF + 404
10  JBELink  $s7JBELink18JLSubscribeManagerC18publishDataToTopic5topic6params8baseInfo10completion8progressySSSg_SaySDySSypSgGGSDySSypGSgySb_AA011JLSubDeviceK0VSgAKtcSgySScSgtF + 364
11  JBELink  $s7JBELink18JLSubscribeManagerC11publishData8deviceId6params8baseInfo10completionySSSg_SaySDySSypGGAJSgySb_AA011JLSubDeviceJ0VSgypSgtcSgtF + 368
12  JBELink  $s7JBELink18JLSubscribeManagerC11publishData8deviceId6params8baseInfoySSSg_SaySDySSypGGAISgtF + 96
13  JBELink  $s7JBELink18JLSubscribeManagerC11publishData8deviceId6paramsySSSg_SaySDySSypGGtF + 84
14  JBELink  $s7JBELink13JLDataManagerC14getGatewayInfo33_7FA623ED952DD5A5136553B2FA6EDFAALL11wifiModelId3macySSSg_AHtFyycfU_ + 2424
15  JBELink  $s7JBELink18JLSubscribeManagerC16subscribeToTopic8topicKey0G010completionySS_SSSgyyctFyycfU0_ + 268
16  JBELink  $sIeg_IeyB_TR + 48
17  AWSIoT  __47-[AWSIoTMQTTClient session:newAckForMessageId:]_block_invoke + 40
18  libdispatch.dylib  __dispatch_call_block_and_release + 32
19  libdispatch.dylib  __dispatch_client_callout + 20
20  libdispatch.dylib  __dispatch_root_queue_drain + 864
21  libdispatch.dylib  __dispatch_worker_thread2 + 156
22  libsystem_pthread.dylib  __pthread_wqthread + 228

#0 Thread
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  GraphicsServices  GSEventRunModal + 164
8  UIKitCore  0x000000019c5df000 + 4238056
9  UIKitCore  UIApplicationMain + 340
10  UIKitCore  0x000000019c5df000 + 6522116
11  JBELink  $sSo21UIApplicationDelegateP5UIKitE4mainyyFZ + 120
12  JBELink  $s7JBELink11AppDelegateC5$mainyyFZ + 44
13  JBELink  main + 28
14  dyld  0x00000001bdb4a000 + 250196

#1 com.apple.uikit.eventfetch-thread
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  Foundation  0x0000000199209000 + 818012
8  Foundation  0x0000000199209000 + 817580
9  UIKitCore  0x000000019c5df000 + 4319260
10  Foundation  0x0000000199209000 + 910376
11  libsystem_pthread.dylib  _pthread_start + 136

#2 Thread
0  libsystem_kernel.dylib  __psynch_cvwait + 8
1  libsystem_pthread.dylib  0x00000001f6e2c000 + 14052
2  JBELink  _ZN6XBASIC7CXEvent19WaitForSingleObjectEi + 204
3  JBELink  _ZN6XBASIC10CRunDriver7RunWorkEPNS_7CXEventEi + 172
4  JBELink  _ZN6XBASIC10CRunDriver12ThreadRunFunEPv + 32
5  libsystem_pthread.dylib  _pthread_start + 136

#3 Thread
0  libsystem_kernel.dylib  __psynch_cvwait + 8
1  libsystem_pthread.dylib  0x00000001f6e2c000 + 14052
2  JBELink  _ZN6XBASIC7CXEvent19WaitForSingleObjectEi + 204
3  JBELink  _ZN6XBASIC10CRunDriver7RunWorkEPNS_7CXEventEi + 172
4  JBELink  _ZN6XBASIC10CRunDriver12ThreadRunFunEPv + 32
5  libsystem_pthread.dylib  _pthread_start + 136

#4 Thread
0  libsystem_kernel.dylib  __psynch_cvwait + 8
1  libsystem_pthread.dylib  0x00000001f6e2c000 + 14052
2  JBELink  _ZN6XBASIC7CXEvent19WaitForSingleObjectEi + 204
3  JBELink  _ZN6XBASIC10CRunDriver7RunWorkEPNS_7CXEventEi + 172
4  JBELink  _ZN6XBASIC10CRunDriver12ThreadRunFunEPv + 32
5  libsystem_pthread.dylib  _pthread_start + 136

#5 Thread
0  libsystem_kernel.dylib  __semwait_signal + 8
1  libsystem_c.dylib  nanosleep + 220
2  JBELink  _ZN6XBASIC7CXTimer5OnRunEv + 1604
3  JBELink  _ZN6XBASIC7CXTimer12ThreadRunFunEPv + 12
4  libsystem_pthread.dylib  _pthread_start + 136

#6 Thread
0  libsystem_kernel.dylib  __semwait_signal + 8
1  libsystem_c.dylib  nanosleep + 220
2  JBELink  _ZN15CNetSelectWoker5OnRunEv + 3640
3  JBELink  _ZN15CNetSelectWoker13FunThreadWorkEPv + 12
4  libsystem_pthread.dylib  _pthread_start + 136

#7 Realm notification listener
0  libsystem_kernel.dylib  kevent + 8
1  Realm  _ZN5realm5_impl20ExternalCommitHelper6listenEv + 156
2  Realm  _ZNSt3__114__thread_proxyB7v160006INS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEEZN5realm5_impl20ExternalCommitHelperC1ERNS8_16RealmCoordinatorERKNS7_11RealmConfigEE3$_0EEEEEPvSH_ + 56
3  libsystem_pthread.dylib  _pthread_start + 136

#8 Thread
0  libsystem_kernel.dylib  __psynch_cvwait + 8
1  libsystem_pthread.dylib  0x00000001f6e2c000 + 14052
2  JBELink  _ZN6XBASIC7CXEvent19WaitForSingleObjectEi + 176
3  JBELink  _ZN6XBASIC13CXTCPSelector5OnRunEv + 1300
4  JBELink  _ZN6XBASIC13CXTCPSelector13FunThreadWorkEPv + 12
5  libsystem_pthread.dylib  _pthread_start + 136


#9 com.apple.NSURLConnectionLoader
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  CFNetwork  _CFHostIsDomainTopLevel + 108176
8  Foundation  0x0000000199209000 + 910376
9  libsystem_pthread.dylib  _pthread_start + 136

#10 Thread
0  libsystem_kernel.dylib  __psynch_cvwait + 8
1  libsystem_pthread.dylib  0x00000001f6e2c000 + 14052
2  libc++.1.dylib  _ZNSt3__118condition_variable4waitERNS_11unique_lockINS_5mutexEEE + 28
3  Realm  _ZN5realm2DB17AsyncCommitHelper4mainEv + 116
4  Realm  _ZNSt3__114__thread_proxyB7v160006INS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEEZN5realm2DB17AsyncCommitHelper12start_threadEvEUlvE_EEEEEPvSC_ + 52
5  libsystem_pthread.dylib  _pthread_start + 136

#11 com.squareup.SocketRocket.NetworkThread
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  Foundation  0x0000000199209000 + 818012
8  AWSIoT  -[_SRRunLoopThread main] + 284
9  Foundation  0x0000000199209000 + 910376
10  libsystem_pthread.dylib  _pthread_start + 136

#12 com.apple.CFSocket.private
0  libsystem_kernel.dylib  __select + 8
1  CoreFoundation  0x000000019a35d000 + 781180
2  libsystem_pthread.dylib  _pthread_start + 136

#13 Thread
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  Foundation  0x0000000199209000 + 818012
8  AWSIoT  -[AWSIoTStreamThread main] + 1032
9  Foundation  0x0000000199209000 + 910376
10  libsystem_pthread.dylib  _pthread_start + 136

#14 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#15 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#16 Thread
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  Foundation  0x0000000199209000 + 818012
8  AWSIoT  -[AWSIoTMQTTClient scheduleReconnection] + 516
9  AWSIoT  __43-[AWSIoTMQTTClient initiateReconnectTimer:]_block_invoke + 40
10  libdispatch.dylib  0x00000001a2283000 + 8508
11  libdispatch.dylib  0x00000001a2283000 + 15828
12  libdispatch.dylib  0x00000001a2283000 + 46080
13  libdispatch.dylib  0x00000001a2283000 + 48944
14  libdispatch.dylib  0x00000001a2283000 + 93364
15  libdispatch.dylib  0x00000001a2283000 + 91432
16  libsystem_pthread.dylib  _pthread_wqthread + 288

#17 Thread
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  Foundation  0x0000000199209000 + 818012
8  AWSIoT  -[AWSIoTStreamThread main] + 1032
9  Foundation  0x0000000199209000 + 910376
10  libsystem_pthread.dylib  _pthread_start + 136

#18 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#19 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#20 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#21 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#22 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#23 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#24 Thread
0  libsystem_kernel.dylib  __workq_kernreturn + 8
1  libsystem_pthread.dylib  _pthread_wqthread + 364

#25 Thread
0  CoreFoundation  0x000000019a35d000 + 187404
1  CoreFoundation  0x000000019a35d000 + 187380
2  AWSIoT  -[AWSMQTTSession publishDataAtLeastOnce:onTopic:retain:onMessageIdResolved:] + 716
3  AWSIoT  -[AWSIoTMQTTClient publishData:qos:onTopic:retain:ackCallback:] + 1144
4  AWSIoT  -[AWSIoTDataManager publishData:onTopic:QoS:retain:ackCallback:] + 388
5  AWSIoT  -[AWSIoTDataManager publishData:onTopic:QoS:ackCallback:] + 148
6  AWSIoT  -[AWSIoTDataManager publishData:onTopic:QoS:] + 116
7  JBELink  $s7JBELink18JLSubscribeManagerC18publishDataToTopic5topic6params8baseInfo10completion8progressySSSg_SaySDySSypSgGGSDySSypGSgySb_AA011JLSubDeviceK0VSgAKtcSgySScSgtFySbcfU_ + 916
8  JBELink  $s7JBELink18JLSubscribeManagerC18publishDataToTopic5topic6params8baseInfo10completion8progressySSSg_SaySDySSypSgGGSDySSypGSgySb_AA011JLSubDeviceK0VSgAKtcSgySScSgtFySbcfU_TA + 68
9  JBELink  $s7JBELink18JLSubscribeManagerC15configAwsStatus33_7DEEC69F00E0BBD387F70BD528265A31LL10completionyySbc_tF + 404
10  JBELink  $s7JBELink18JLSubscribeManagerC18publishDataToTopic5topic6params8baseInfo10completion8progressySSSg_SaySDySSypSgGGSDySSypGSgySb_AA011JLSubDeviceK0VSgAKtcSgySScSgtF + 364
11  JBELink  $s7JBELink18JLSubscribeManagerC11publishData8deviceId6params8baseInfo10completionySSSg_SaySDySSypGGAJSgySb_AA011JLSubDeviceJ0VSgypSgtcSgtF + 368
12  JBELink  $s7JBELink18JLSubscribeManagerC11publishData8deviceId6params8baseInfoySSSg_SaySDySSypGGAISgtF + 96
13  JBELink  $s7JBELink18JLSubscribeManagerC11publishData8deviceId6paramsySSSg_SaySDySSypGGtF + 84
14  JBELink  $s7JBELink13JLDataManagerC14getGatewayInfo33_7FA623ED952DD5A5136553B2FA6EDFAALL11wifiModelId3macySSSg_AHtFyycfU_ + 2424
15  JBELink  $s7JBELink18JLSubscribeManagerC16subscribeToTopic8topicKey0G010completionySS_SSSgyyctFyycfU0_ + 268
16  JBELink  $sIeg_IeyB_TR + 48
17  AWSIoT  __47-[AWSIoTMQTTClient session:newAckForMessageId:]_block_invoke + 40
18  libdispatch.dylib  0x00000001a2283000 + 8508
19  libdispatch.dylib  0x00000001a2283000 + 15828
20  libdispatch.dylib  0x00000001a2283000 + 88684
21  libdispatch.dylib  0x00000001a2283000 + 90268
22  libsystem_pthread.dylib  _pthread_wqthread + 228

#26 Thread
0  libsystem_kernel.dylib  mach_msg2_trap + 8
1  libsystem_kernel.dylib  mach_msg2_internal + 80
2  libsystem_kernel.dylib  mach_msg_overwrite + 436
3  libsystem_kernel.dylib  mach_msg + 24
4  CoreFoundation  0x000000019a35d000 + 343900
5  CoreFoundation  0x000000019a35d000 + 341504
6  CoreFoundation  CFRunLoopRunSpecific + 608
7  Foundation  0x0000000199209000 + 818012
8  AWSIoT  -[AWSIoTStreamThread main] + 1032
9  Foundation  0x0000000199209000 + 910376
10  libsystem_pthread.dylib  _pthread_start + 136

#27 Thread
0  libsystem_kernel.dylib  __select + 8
1  JBELink  _ZN6XBASIC11SKT_ConnectEPK8addrinfoii + 428
2  JBELink  _ZN6XBASIC26CheckAddrInfoAndSKTConnectEPKcS1_iii + 296
3  JBELink  _ZN6XBASIC11SKT_ConnectEPKciii10OBJ_HANDLE + 220
4  JBELink  _ZN10CXHttpsNet19SslCreateConnectionEPKcii + 168
5  JBELink  _ZN10CXHttpsNet7ConnectEPKcii + 136
6  JBELink  _ZN8CSMPHttp4TalkEP13CHttpProtocoliiPP5XData + 376
7  JBELink  _ZN10XMCloudAPI8IXMCloud13GetDevsStatusEPKciiPNS_12SDevAuthInfoEPPNS_10SDevStatusEiiNS_13_E_CHECK_TYPEE + 884
8  JBELink  _Z17PQueryStateNormalPv + 1196
9  libsystem_pthread.dylib  _pthread_start + 136

#28 Thread
0  libsystem_kernel.dylib  __semwait_signal + 8
1  libsystem_c.dylib  nanosleep + 220
2  JBELink  _Z17PQueryStateP2P_V0Pv + 352
3  libsystem_pthread.dylib  _pthread_start + 136

#29 Thread

#30 Thread

#31 Thread
0  libsystem_kernel.dylib  __semwait_signal + 8
1  libsystem_c.dylib  nanosleep + 220
2  JBELink  _Z17PQueryStateP2P_V0Pv + 352
3  libsystem_pthread.dylib  _pthread_start + 136

#32 Thread

#33 Thread

#34 Thread
0  libsystem_kernel.dylib  __semwait_signal + 8
1  libsystem_c.dylib  nanosleep + 220
2  JBELink  _Z17PQueryStateP2P_V0Pv + 352
3  libsystem_pthread.dylib  _pthread_start + 136

#35 Thread

#36 Thread
0  libsystem_kernel.dylib  __semwait_signal + 8
1  libsystem_c.dylib  nanosleep + 220
2  JBELink  _ZN11CDeviceBase13SearchDevicesERNSt3__14listI24SDK_CONFIG_NET_COMMON_V2NS0_9allocatorIS2_EEEERN6XBASIC9CKeyValueEPKcSB_ii + 1372
3  JBELink  _ZN11CDeviceBase15SearchDevicesExERNSt3__14listI24SDK_CONFIG_NET_COMMON_V2NS0_9allocatorIS2_EEEEPKcii + 88
4  JBELink  _ZN11CDataCenter13SearchDevicesEPv + 36
5  libsystem_pthread.dylib  _pthread_start + 136

#37 Thread

Thanks for providing the details. We will investigate this further and get back to you.

@phantumcode @harsh62 @vincetran Hello? May I ask who can give me some advice?

@09chen1998 Unfortunately we haven't had the bandwidth to look into this specific issue yet. Once we do and have made some headway, we'll update the issue accordingly.

@vincetran Okay, thank you for your reply.

@09chen1998 Sorry for the delay. Unfortunately, you still have not provided a full symbolicated crash log; what you pasted doesn't have the information we need to fully diagnose the crash.

You can find instructions on how to retrieve a full crash log here: https://developer.apple.com/documentation/xcode/acquiring-crash-reports-and-diagnostic-logs.

While it's possible that what you're experiencing is a race condition as you suggested and we'll work on that, we can't know for sure without the proper logs. There might be something else going on.

In the meantime, I've pushed a fix for the potential race condition to the fix/timerring_crash branch.

Would you be able to try it out and let us know if it addresses your issues?
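
While the branch is being verified, one app-side mitigation worth considering is to route every publish through a single serial queue, since the pasted stack shows `publishData` being called from inside an ack callback running on a libdispatch worker thread. This is only a sketch under stated assumptions: `SerialPublisher` and its `send` closure are hypothetical names, the closure stands in for the app's real call into `AWSIoTDataManager`, and the approach can only serialize the app's own calls, not work the SDK performs on its internal threads.

```swift
import Foundation

/// Hypothetical app-side helper (not an AWS SDK type). The `send` closure is
/// a stand-in for the app's real publish call.
final class SerialPublisher {
    private let queue = DispatchQueue(label: "example.mqtt-publish")
    private let send: (Data, String) -> Void

    init(send: @escaping (Data, String) -> Void) {
        self.send = send
    }

    /// Enqueues the publish; calls run one at a time, in submission order.
    func publish(_ payload: Data, onTopic topic: String) {
        queue.async { self.send(payload, topic) }
    }

    /// Blocks until everything enqueued so far has run (useful in tests).
    func drain() {
        queue.sync {}
    }
}

// Usage: callers on any thread go through the same queue.
var sentTopics: [String] = []
let publisher = SerialPublisher { _, topic in
    sentTopics.append(topic)   // mutated only on the publisher's queue
}
DispatchQueue.concurrentPerform(iterations: 100) { i in
    publisher.publish(Data(), onTopic: "device/\(i)")
}
publisher.drain()
print(sentTopics.count)
```

If the crash persists with the app's calls serialized like this, that would suggest the contention is between the app's thread and one of the SDK's own threads, which only an SDK-side fix can address.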

@ruisebas Sorry, the crash logs I provided earlier may not have been complete enough. I have attached three new complete crash logs. 'JBELink-2024-09-03-175837.docx' and 'JBELink-2024-09-05-100828.docx' are the complete versions of the crash logs I provided earlier, while 'JBELink-2024-09-25-172632.docx' corresponds to the screenshot I provided earlier with the text 'sometimes crash at here'. Both kinds of crash occur only occasionally.
JBELink-2024-09-03-175837.docx
JBELink-2024-09-05-100828.docx
JBELink-2024-09-25-172632.docx

At the same time, I will try using the fix/timerring_crash branch, which may take some time to verify its effectiveness.