fetchAuthSession fails after device wakes
kparichan opened this issue · 18 comments
Describe the bug
In our analytic reports we see fetchAuthSession failing after a device wakes up, and has been asleep for some time.
Additional info provided the "Additional context" section below.
This is not a new issue for us. It's been showing up intermittently for the last several months, across various versions of amplify-swift.
Steps To Reproduce
There's no clear way to reproduce this. Just sleeping the device and waking it later with the app running does not reproduce this issue. It's something that we almost always notice in the field.
Expected behavior
Network connectivity is available, so fetchAuthSession should return a valid auth token or flag the user as not being signed in.
Amplify Framework Version
2.35.4
Amplify Categories
Auth
Dependency manager
Swift PM
Swift version
5.10
CLI version
12.1.0
Xcode version
15.4 (15F31d)
Relevant log output
No logs are available.
Is this a regression?
No
Regression additional context
No response
Platforms
iOS
OS Version
No specific OS version
Device
No specific device
Specific to simulators
No response
Additional context
Information originally included in response to issue #3540.
We're also seeing behavior in our app which seems to line up with this issues. I don't have full logs unfortunately, only what I've been able to piece together via analytics.
We have the same testers on iOS and Android, and the code is the same across platforms, but we've only ever seen this issue on iOS.
The issue appears to happen when calling fetchAuthSession right after the device wakes up, and has been asleep for some time.
While that call fails attempting to fetch/refresh the auth token, and with Amplify.Hub.subscribe setup, AuthChannelEventName.SESSION_EXPIRED does not appear to be triggered.
The failure to fetchAuthSession reports:
localizedDescription = "The operation couldn’t be completed. (Amplify.AuthError error 1.)"
print() = "network(AuthError: Unknown service error occurred Recovery suggestion: See the attached error for mo" (Truncated by analytics)The only other info I have is a screenshot from a debug build which reports an error of "The network connection was lost." It was attempting to contact a url of "https://cognito-idp.us-west-2.amazonaws.com/".
I don't know how the internal networking is handled, but is the framework attempting to preflight network connectivity before actually making a request?
In general, that goes against the recommended behavior and could lead to an issue where connectivity checking fails because the device radios haven't been powered up yet right after wake.
This may also relate to another closed issue that my coworker pointed out:
#3437Normally you'd just set waitsForConnectivity and a short timeoutIntervalForResource value for the URLSessionConfiguration and issue any calls without any preflight connectivity checking.
Apologies if this has been investigated previously.
Would you be able to share you code snippet that is being used upon device getting awaked. Additionally, could you also share the metrics that you are seeing, if there is any specific logs or errors that could help narrow down the issue.
Inside of a Task, it's basically just...
do {
let authSession = try await Amplify.Auth.fetchAuthSession()
// ... handle success
} catch {
Debug.log("Fetch failed", error)
// ... report failure
}
The error returned is what I mentioned in the context section above.
There are no logs or analytics that I can share. The error is logged via analytics, which is how I got those same results posted above.
And as I mentioned also, in this case, HubPayload.EventName.Auth.sessionExpired is NOT reported.
We also have this issue. However, I have not yet found out the exact cause and cannot reproduce it.
Thank you - it'd be great if you can share reproduction steps when you run into this issue again.
Today I had the problem again on one of our test devices.
After the app had been in the background overnight and I had then opened it.
Here are the logs from the console:
11:09:09.316133+0200 BeTidy Connection 9: received H2 RST_STREAM with error code: 1
11:09:09.316670+0200 BeTidy Task .<1> done using Connection 9
11:09:09.316787+0200 BeTidy [C9] event: client:connection_idle @60.558s
11:09:09.317085+0200 BeTidy Task .<1> can retry(N) with reason(2) for error [1:57]
11:09:09.317146+0200 BeTidy Task .<1> HTTP load failed, 247/0 bytes (error code: -1005 [1:57])
11:09:09.318141+0200 BeTidy Task .<1> summary for task failure {transaction_duration_ms=60559, response_status=200, connection=9, protocol="h2", domain_lookup_duration_ms=3, connect_duration_ms=513, secure_connection_duration_ms=120, private_relay=false, request_start_ms=522, request_duration_ms=0, response_start_ms=0, response_duration_ms=0, request_bytes=247, response_bytes=0, cache_hit=true}
11:09:09.318429+0200 BeTidy Task .<1> finished with error [-1005] Error Domain=NSURLErrorDomain Code=-1005 UserInfo={_kCFStreamErrorCodeKey=57, NSUnderlyingError=0x302a9c6c0 {Error Domain=kCFErrorDomainCFNetwork Code=-1005 UserInfo={NSErrorPeerAddressKey=, _kCFStreamErrorCodeKey=57, _kCFStreamErrorDomainKey=1}}, _NSURLErrorFailingURLSessionTaskErrorKey=, _NSURLErrorRelatedURLSessionTaskErrorKey=, NSLocalizedDescription=, NSErrorFailingURLStringKey=, NSErrorFailingURLKey=, _kCFStreamErrorDomainKey=1}
11:09:09.321650+0200 BeTidy Connection 9: cleaning up
11:09:09.321711+0200 BeTidy [C9 BC51CC96-4275-423D-88CB-A1CD50CE25D5 Hostname#11115440:443 quic-connection, url hash: 341d5f92, definite, attribution: developer] cancel
11:09:09.321880+0200 BeTidy [C9 BC51CC96-4275-423D-88CB-A1CD50CE25D5 Hostname#11115440:443 quic-connection, url hash: 341d5f92, definite, attribution: developer] cancelled
[C9.1.1.1 AD999B24-AF29-4143-8406-3F54F0743EEA 1111:11:1111:dd13:cdc6:a816:119c:1e14.49368<->IPv6#1111e982.443]
Connected Path: satisfied (Path is satisfied), viable, interface: pdp_ip0[endc_sub6], ipv4, ipv6, dns, expensive, uses cell
Privacy Stance: Not Eligible
Duration: 60.563s, DNS @0.001s took 0.003s, TCP @0.006s took 0.391s, TLS 1.3 took 0.120s
bytes in/out: 5794/984, packets in/out: 7/16, rtt: 0.060s, retransmitted bytes: 0, out-of-order bytes: 0
ecn packets sent/acked/marked/lost: 0/0/0/0
11:09:09.322395+0200 BeTidy nw_protocol_tcp_log_summary [C9.1.1.1:3]
[BA495FB6-CB13-438D-A8CF-972F51C5552A :49368<->:443]
Init: 1, Conn_Time: 391.349ms, SYNs: 4, WR_T: 0/0, RD_T: 0/0, TFO: 0/0/0, ECN: 0/0/0, Accurate ECN (client/server): Disabled/Disabled, TS: 1, TSO: 0
rtt_cache: process, rtt_upd: 4, rtt: 60.375ms, rtt_var: 37.437ms rtt_nc: 130.531ms, rtt_var_nc: 81.625ms base rtt: 23ms
ACKs-compressed: 0, ACKs delayed: 0 delayed ACKs sent: 0
11:09:09.322555+0200 BeTidy nw_flow_disconnected [C9 IPv6#1111e982.443 cancelled parent-flow ((null))] Output protocol disconnected
11:09:09.322695+0200 BeTidy Fetch identityId failed with error: Unexpected error occurred with message: Unknown error occurred
Recovery suggestion: This should not happen. There is a possibility that there is a bug if this error persists. Please take a look at https://github.com/aws-amplify/amplify-swift/issues to see if there are any existing issues that match your scenario, and file an issue with the details of the bug if there isn't.
Our code at this point looks like this:
do {
let session = try await Amplify.Auth.fetchAuthSession()
if let identityProvider = session as? AuthCognitoIdentityProvider {
let identityIdResult = identityProvider.getIdentityId()
switch identityIdResult {
case .success(let identityId):
// ... handle identityId
case .failure(let error):
// ... here we get the error
NSLog("Fetch identityId failed with error: \(error)")
throw BTError.runtimeError(self._getErrorMessage(pError: error))
}
}
} catch {
// ... handle error
}
One more piece of information.
We have added a retry dialog for this case, where we call Amplify.Auth.fetchAuthSession with forceRefresh and hoped that this would get the user out of the situation. But unfortunately that doesn't help either. The user has to kill the app completely and restart it to be able to use it normally again.
@tobias-feldmann Do you think you would be able to get verbose logs for the scenario that you faced? You can enable verbose logging to the console by doing this before calling Amplify.configure
:
Amplify.Logging.logLevel = .verbose
@harsh62
I will give it a try. However, I can only very rarely reproduce the problem on our test devices. So it could take a while.