ipfs/boxo

Setup test flakiness monitoring, reporting, and SOP


Done Criteria

We have data showing which tests are our flakiest, and we can quantify how flakiness is trending over time.
We also have an SOP for how we handle flaky tests. This shouldn't be tribal knowledge or guesswork.
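
As a rough illustration of the "top flaky tests" data, here is a minimal sketch that ranks tests from `go test -json` output. It is not the go-libp2p tooling and not an agreed design. The names `testEvent`, `stats`, `record`, and `topFlaky` are made up for this example, and it assumes every run in the input is of the same commit, so a test that both passed and failed is treated as flaky.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// testEvent holds the subset of fields from `go test -json` events used here.
type testEvent struct {
	Action  string
	Package string
	Test    string
}

// stats counts pass/fail outcomes for one test across all runs in the input.
type stats struct {
	Name   string
	Passes int
	Fails  int
}

// FailureRate returns fails / (passes + fails), or 0 if the test never finished.
func (s stats) FailureRate() float64 {
	total := s.Passes + s.Fails
	if total == 0 {
		return 0
	}
	return float64(s.Fails) / float64(total)
}

// record parses one line of `go test -json` output and updates byTest.
// Non-JSON lines, package-level events, and non-terminal actions are ignored.
func record(byTest map[string]*stats, line []byte) {
	var ev testEvent
	if err := json.Unmarshal(line, &ev); err != nil || ev.Test == "" {
		return
	}
	if ev.Action != "pass" && ev.Action != "fail" {
		return
	}
	key := ev.Package + "." + ev.Test
	s, ok := byTest[key]
	if !ok {
		s = &stats{Name: key}
		byTest[key] = s
	}
	if ev.Action == "pass" {
		s.Passes++
	} else {
		s.Fails++
	}
}

// topFlaky returns tests that both passed and failed, highest failure rate first.
// Assumption: all runs are of the same commit, so mixed results mean flakiness.
func topFlaky(byTest map[string]*stats) []stats {
	var out []stats
	for _, s := range byTest {
		if s.Passes > 0 && s.Fails > 0 {
			out = append(out, *s)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		ri, rj := out[i].FailureRate(), out[j].FailureRate()
		if ri != rj {
			return ri > rj
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	byTest := map[string]*stats{}
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long output lines
	for sc.Scan() {
		record(byTest, sc.Bytes())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	for _, s := range topFlaky(byTest) {
		fmt.Printf("%-60s %d/%d failed (%.0f%%)\n",
			s.Name, s.Fails, s.Passes+s.Fails, 100*s.FailureRate())
	}
}
```

One way to feed it, assuming the file is saved as `flaky.go`: `go test -json -count=10 ./... | go run flaky.go`. Tracking the trend over time would additionally need the results stored per CI run, which this sketch does not attempt.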

Why Important

Flaky tests will doom this endeavor because they will either:

  1. Delay getting releases out, even when only a small change was made in a tiny module, OR
  2. Catch us in the tar pit of relying on human judgement about whether we can merge/release or not. We won't make this judgement correctly every time, and unnecessary bugs will slip out the door.

We want to be able to quickly spot flakiness so we know where to eradicate it.

Notes

go-libp2p has tooling (including a ci-flakiness-badge) that I think we can follow/adopt/use.