DownloadFile failing repeatedly on Darwin_Build_Release_x64 despite the file being there
lewing opened this issue · 14 comments
Build Information
Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=444484
Build error leg or test failing: Build / Darwin_Build_Release_x64 / Build
Pull request: #17482
Error Message
Fill the error message using step by step known issues guidance.
{
"ErrorMessage": "error : Download from all targets failed. List of attempted targets: https://dotnetcli.blob.core.windows.net/dotnet/",
"ErrorPattern": "",
"BuildRetry": true,
"ExcludeConsoleLog": false
}
Known issue validation
Build:
Result validation:
Validation performed at: 11/9/2023 12:50:56 AM UTC
Report
Summary
24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
---|---|---|
0 | 0 | 0 |
cc @dotnet/domestic-cat
Looking at the binlog:
Retrying download of 'https://dotnetbuilds.blob.core.windows.net/public/Runtime/9.0.0-alpha.1.23512.2/dotnet-runtime-9.0.0-alpha.1.23512.2-osx-x64.tar.gz' to '/Users/runner/work/1/s/artifacts/obj/redist/Release/downloads/dotnet-runtime-9.0.0-alpha.1.23512.2-osx-x64.tar.gz' due to failure: 'The SSL connection could not be established, see inner exception.' (3/3)
It attempts 4 times with that error each time
Looks a lot like dotnet/runtime#1979 cc @wfurt
relevant code is here https://github.com/dotnet/arcade/blob/c287c528f09803900418e4a1e399c807b020cbc7/src/Microsoft.DotNet.Arcade.Sdk/src/DownloadFile.cs#L173-L188 if more logging would be useful
Why do you think TLS 1.3 would matter @lewing? AFAIK it is not strictly enforced by our infrastructure and TLS 1.2 should still work fine. Did you look at what the inner exception is?
Honestly I just need someone on the networking side to look into it and was searching issues. It started happening recently on the Macs and is breaking a considerable number of builds. We can't easily get more information into the binlog without changing arcade and flowing the change here. cc @mmitche
Note that on Windows, TLS 1.3 is supported only on Windows 11 & Server 2022. So if 1.3 were the culprit, it would also fail on all older Windows versions.
And according to https://www.ssllabs.com/ssltest/analyze.html?d=dotnetbuilds.blob.core.windows.net
the server does not even support TLS 1.3 (only 1.0, 1.1 & 1.2)
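For anyone who wants to check the negotiated protocol from an affected agent rather than from SSL Labs, here is a minimal sketch in Python. It is not what Arcade's DownloadFile task does; the helper names `make_context` and `negotiated_tls_version` are made up for illustration, and it only uses the standard `ssl` and `socket` modules.

```python
import socket
import ssl


def make_context(max_version=None):
    """Build a default client context, optionally capped at a TLS version.

    Hypothetical helper for illustration only.
    """
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    if max_version is not None:
        # For example ssl.TLSVersion.TLSv1_2 to rule TLS 1.3 in or out.
        ctx.maximum_version = max_version
    return ctx


def negotiated_tls_version(host, port=443, timeout=10.0, max_version=None):
    """Connect to host:port and return the negotiated protocol name."""
    ctx = make_context(max_version)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()  # a string such as "TLSv1.2"


# Usage (needs network access, so it is left as a comment):
#   negotiated_tls_version("dotnetbuilds.blob.core.windows.net")
```

If the handshake fails on the agent, the raised `ssl.SSLError` or `TimeoutError` would at least separate a protocol mismatch from a timeout.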
This appears to be a Mac-specific problem, and that queue is quite overloaded. Are there timeouts in the handshake that we might be hitting?
Seems like an infrastructure problem to me. We should get at least some info about the inner exception. It is going to be very difficult to speculate otherwise. For example, it may just be the wrong time on the machine.
Just the message or more details on the inner exception?
The whole thing may be best IMHO, as it would also show any possible error codes. I'm not sure if we capture those in the message text.
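To illustrate what "the whole thing" buys over just the outer message, here is a rough Python sketch of walking an exception chain and logging every level. It is an analogy only, not the Arcade code: `describe_exception_chain` is a hypothetical helper, and the inner `TimeoutError` in the example is invented, since the real inner exception is exactly what we don't know yet.

```python
def describe_exception_chain(exc):
    """Return one 'Type: message' line per exception, outermost first."""
    lines = []
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against cycles in the chain
        lines.append(f"{type(current).__name__}: {current}")
        # Prefer the explicit cause (raise ... from ...), else the implicit context.
        if current.__cause__ is not None:
            current = current.__cause__
        else:
            current = current.__context__
    return lines


def example():
    """Build a two-level chain; the inner error is a made-up placeholder."""
    try:
        try:
            raise TimeoutError("handshake timed out (hypothetical)")
        except TimeoutError as inner:
            raise ConnectionError(
                "The SSL connection could not be established"
            ) from inner
    except ConnectionError as outer:
        return describe_exception_chain(outer)
```

Logging only the first line gives the message already in the binlog; the second line is where the actual cause would show up.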
Could this have been as simple as a redirect?
I'm not sure why it would impact only macOS. We should be able to follow redirects, and there is a limit on how many we would process.
calling this closed