CI flakiness issues
luispadron opened this issue · 8 comments
Summary
Over the last few weeks I've noticed an increase in flakiness when running PR tests. Typically the tests pass after being re-run. We should look into why these tests are flaky and fix them if possible.
Issue 1
# Host config
bazelisk test --local_test_jobs=1 -- //... -//tests/ios/...
# `deleted_packages` is needed below in order to override the value of the .bazelrc file
bazelisk test --local_test_jobs=1 --apple_platform_type=ios --deleted_packages='' -- //tests/ios/...
shell: /bin/bash -e {0}
2022/08/17 17:30:30 Downloading https://releases.bazel.build/5.0.0/release/bazel-5.0.0-darwin-x86_64...
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
Loading:
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 26 packages loaded
currently loading: tests/ios/frameworks/dynamic/c ... (2 packages)
Loading: 34 packages loaded
currently loading: tests/ios/lldb/app ... (2 packages)
Loading: 70 packages loaded
currently loading: tests/ios/lldb/app ... (4 packages)
Loading: 70 packages loaded
currently loading: tests/ios/lldb/app ... (4 packages)
Analyzing: 194 targets (74 packages loaded, 0 targets configured)
Analyzing: 194 targets (87 packages loaded, 317 targets configured)
Analyzing: 194 targets (88 packages loaded, 317 targets configured)
Analyzing: 194 targets (89 packages loaded, 322 targets configured)
Analyzing: 194 targets (89 packages loaded, 322 targets configured)
Analyzing: 194 targets (110 packages loaded, 345 targets configured)
Analyzing: 194 targets (134 packages loaded, 730 targets configured)
Analyzing: 194 targets (135 packages loaded, 730 targets configured)
Analyzing: 194 targets (135 packages loaded, 730 targets configured)
Analyzing: 194 targets (135 packages loaded, 730 targets configured)
Analyzing: 194 targets (135 packages loaded, 730 targets configured)
DEBUG: /Users/runner/work/rules_ios/rules_ios/rules/library/xcconfig.bzl:21:12: CLANG_TRIVIAL_AUTO_VAR_INIT: "unknown" not a valid value, must be one of ["uninitialized", "pattern"]
Analyzing: 194 targets (145 packages loaded, 5738 targets configured)
Analyzing: 194 targets (145 packages loaded, 5738 targets configured)
Analyzing: 194 targets (145 packages loaded, 5738 targets configured)
ERROR: /private/var/tmp/_bazel_runner/c2342bca25695567c8c3f3be166b56b3/external/bazel_tools/tools/cpp/BUILD:57:19: Target '@bazel_tools//tools/cpp:current_cc_toolchain' depends on toolchain '@local_config_cc//:cc-compiler-darwin', which cannot be found: error loading package '@local_config_cc//': cannot load '@local_config_cc_toolchains//:osx_archs.bzl': no such file'
ERROR: Analysis of target '//tests/framework/platforms:framework_deps_macos_101000' failed; build aborted:
INFO: Elapsed time: 265.202s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (145 packages loaded, 5738 targets configured)
ERROR: Couldn't start the build. Unable to run tests
FAILED: Build did NOT complete successfully (145 packages loaded, 5738 targets configured)
Error: Process completed with exit code 1.
The majority of the flaky failures seem to be related to:
Target '@bazel_tools//tools/cpp:current_cc_toolchain' depends on toolchain '@local_config_cc//:cc-compiler-darwin', which cannot be found: error loading package '@local_config_cc//': cannot load '@local_config_cc_toolchains//:osx_archs.bzl': no such file'
More info: bazelbuild/bazel#14603
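Since the failing runs usually pass on a re-run, one stopgap could be to retry only when the log contains this specific error. The sketch below is a rough illustration, not something from this repo: `retry_on_flake` and `FLAKE_PATTERN` are hypothetical names, and it assumes that simply re-running the same command is enough to get past the flake, which has not been verified here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rules_ios): re-run a command only when its
# output contains the known toolchain flake signature from the log above.

FLAKE_PATTERN="cannot load '@local_config_cc_toolchains//:osx_archs.bzl'"

retry_on_flake() {
  local max_attempts="$1"
  shift
  local attempt=1
  local status=0
  local log
  log="$(mktemp)"

  while true; do
    # Run inside `if` so a failure does not abort a `bash -e` shell.
    if "$@" >"$log" 2>&1; then
      status=0
    else
      status=$?
    fi
    cat "$log"

    # Success: nothing more to do.
    if [ "$status" -eq 0 ]; then
      break
    fi
    # Failure that does not match the flake signature: report it as-is.
    if ! grep -qF "$FLAKE_PATTERN" "$log"; then
      break
    fi
    # Matching failure, but no attempts left.
    if [ "$attempt" -ge "$max_attempts" ]; then
      break
    fi

    echo "Known toolchain flake (attempt $attempt of $max_attempts), retrying..." >&2
    attempt=$((attempt + 1))
  done

  rm -f "$log"
  return "$status"
}
```

It could wrap the existing CI commands, e.g. `retry_on_flake 3 bazelisk test --local_test_jobs=1 -- //... -//tests/ios/...`. Any other failure is returned unchanged on the first attempt, so real test failures would not be masked.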
Looks like the fix for the osx_toolchain issue is to update to Bazel 6.0.0
@luispadron 💯 I've seen a similar longstanding Bazel flake (going back over 6 years) where these config rules failed under host CPU resource contention. The ones you linked are also here:
If it continues without a fix, I'd propose quickly adding back vendorized toolchains to side-step the flake until someone addresses it.
This had just sat in code review - maybe we should re-open it: bazelbuild/bazel#14328
Oh wow, it's been around for a while then. The issue I linked points to a patch made to Bazel to fix this, but I can't find the commit in Bazel's changelogs 🤔
From the PR it looks like it should've made it into one of the 6.0 tags though, so whenever that gets released we could possibly update. Would we want to use the rolling 6.0 releases on this repo at all? I see rules_apple does this.
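For reference, a rough sketch of what opting into a rolling release might look like with Bazelisk, which reads the version from a `.bazelversion` file in the workspace root. `pin_bazel_version` is a hypothetical helper name, and the `rolling` label is my understanding of Bazelisk's version labels, so it is worth checking against the Bazelisk README before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the Bazel version that Bazelisk should pick up
# for a workspace. Assumes Bazelisk reads <workspace>/.bazelversion.

pin_bazel_version() {
  local workspace_dir="$1"
  local version="$2"
  printf '%s\n' "$version" > "$workspace_dir/.bazelversion"
}
```

Calling `pin_bazel_version . rolling` from the repo root would track the latest rolling release (again assuming the label is supported), and passing an exact version string instead would pin a specific one. For a one-off CI run, setting the `USE_BAZEL_VERSION` environment variable on the `bazelisk` invocation should override the file without committing anything.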
Yeah - I also think it'd be reasonable to temporarily vendorize and iterate on autoconfig repo rules in rules_ios, with the intention of upstreaming them longer term.