NixOS/ofborg

PR check doesn't fail even if all `passthru.tests` fail

erikarvstedt opened this issue · 5 comments

Currently, ofborg marks failing passthru.tests as neutral (grey box), so a ofborg pkg check can succeed even if the tests fail on all platforms.

This recently happend in package update PR NixOS/nixpkgs#206835, which broke the corresponding NixOS module (later fixed here).
The ofborg check was green, and the faulty PR got merged.

Here's a simple fix that would be sufficient to detect most failures:
The PR CI check should succeed only if each test in passthru.tests succeeds on at least one platform.

(A thorough, but complex alternative is to define which platforms are supported for a test in nixpkgs. ofborg should then fail if a test on a supported platform fails.)

cc @NobbZ, @SuperSandro2000

builds not marking the pr red is intentional to avoid false-positive red-failure, mergers should check check logs themselves if they see Unexpected error: command failed
nixpkgs_issue___screenshot_2022-12-29_17-08-22_020500698

#286

NobbZ commented

In my opinion, this gray item just isn't visible enough…

About every PR has at least one gray item…

It's like turning all warnings on, but not treating them as errors… You will just not see the important things in the noise…

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/can-we-make-ofborg-show-broken-tests-as-red/10717/5

Would be useful if we could tell it apart from timeouts.

hraban commented

builds not marking the pr red is intentional to avoid false-positive red-failure, mergers should check check logs themselves if they see Unexpected error: command failed nixpkgs_issue___screenshot_2022-12-29_17-08-22_020500698

#286

Do we have numbers on false positive rate of these tests? I didn't realize this was a problem, but then again I always forget to check anyway so I'd have no idea how often it's gray.