kubernetes-sigs/e2e-framework

Why can't `-parallel` and `-fail-fast` be used together?

Fricounet opened this issue · 7 comments

Hello 👋
I'm trying to understand why the `-parallel` and `-fail-fast` flags can't be used together. I've looked at the PR that implemented fail-fast but couldn't find a reason for the restriction.
I'm asking because I've been using `-fail-fast` with `t.Parallel()` in my tests without running into any issue, so I wonder what is different with `-parallel`?
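For context, here is roughly the pattern I have in mind (a made-up minimal example, not my actual tests; feature names and the failing condition are invented): subtests call `t.Parallel()` and the package is run with something like `go test -failfast ./...`.

```go
// parallel_example_test.go — hypothetical illustration only.
package example

import (
	"testing"
	"time"
)

func TestFeatures(t *testing.T) {
	for _, name := range []string{"featureA", "featureB", "featureC"} {
		name := name // capture loop variable (needed before Go 1.22)
		t.Run(name, func(t *testing.T) {
			t.Parallel() // subtests run concurrently with each other

			time.Sleep(100 * time.Millisecond) // stand-in for real e2e work
			if name == "featureB" {
				t.Fatal("simulated failure")
			}
		})
	}
}
```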

@Fricounet
Apologies for the delay.
So, you are saying that you can run tests with `--fail-fast` and calls to `t.Parallel()`, but can't do it with the `--parallel` flag?
Do you get an error?

@vladimirvivien no worries
Well, the error I get when running `-fail-fast` and `-parallel` together comes from the framework itself. It is defined here: https://github.com/kubernetes-sigs/e2e-framework/blob/main/pkg/flags/flags.go#L283

That's why I don't quite understand it 😅 I thought that maybe I'm just missing some context on why this decision was made initially.
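For reference, the check is essentially a mutual-exclusivity validation on the parsed flags, along these lines (a simplified, hypothetical rendering, not the framework's exact code; field names and the error message are made up):

```go
// Hypothetical sketch of a mutual-exclusivity check on parsed flags.
// The real check lives in pkg/flags/flags.go.
package main

import (
	"errors"
	"fmt"
)

type envFlags struct {
	parallel bool
	failFast bool
}

// validate rejects flag combinations the framework does not support.
func (f envFlags) validate() error {
	if f.parallel && f.failFast {
		return errors.New("--fail-fast and --parallel cannot be used together")
	}
	return nil
}

func main() {
	f := envFlags{parallel: true, failFast: true}
	fmt.Println(f.validate()) // --fail-fast and --parallel cannot be used together
}
```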

@Fricounet Looking at the code I think I remember why it was coded like that.
Maybe the thinking was that if the framework is executing tests concurrently, fail-fast would have to be applied to all tests in flight and end the entire test run. Having that mutual exclusivity makes the behavior easier to reason about.

If you have an idea that can make this better, you should definitely discuss it here. Again, a lot of those decisions were made early on, so they may be due for a revisit.

@vladimirvivien What do you think about canceling the parent context if fail-fast is enabled? That way, all currently running tests would have to stop too.
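Roughly along these lines (a rough, hypothetical sketch of the mechanism, not the framework's actual API; helper names are made up): cancel a shared parent context on the first failure so tests still in flight can observe it and stop early.

```go
// Hypothetical sketch: cancel a shared parent context on the first failure.
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// runAll runs all tests concurrently and cancels the shared context as soon
// as one of them fails, so the others can bail out early.
func runAll(parent context.Context, tests []func(context.Context) error) error {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for _, test := range tests {
		wg.Add(1)
		go func(test func(context.Context) error) {
			defer wg.Done()
			if err := test(ctx); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
				cancel() // fail-fast: signal every other in-flight test to stop
			}
		}(test)
	}
	wg.Wait()
	return firstErr
}

func main() {
	tests := []func(context.Context) error{
		func(ctx context.Context) error { return errors.New("boom") },
		func(ctx context.Context) error {
			<-ctx.Done() // a long-running test that stops when the run is canceled
			return ctx.Err()
		},
	}
	fmt.Println(runAll(context.Background(), tests)) // boom
}
```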

I can try to take a look and see if I can come up with a PR.
Thanks for the context anyway :)

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@Fricounet are you still interested in working on this?

@vladimirvivien yes! I'll try to work on this next week