Add different ways of handling failures

Question

pwr22 opened this issue 6 years ago · 0 comments

Right now, if one command fails, we try to stop every other command and then wait for them all to finish. This ensures that things always end gracefully and that we always get all of the output (zoom was designed for some tools that are a bit flaky at flushing their output).

Other failure modes we should support:

Sending SIGKILL to commands that still have not exited after some timeout.
Ignoring failures and continuing, in a best effort to complete as much work per run as possible (later we will probably want to add some way of tracking what failed, to allow retries).
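
To make the proposed modes concrete, here is a rough sketch in Python. It is not zoom's actual code. The function `run_all` and its parameters `on_failure`, `kill_timeout` and `poll_interval` are hypothetical names made up for illustration. It assumes commands are started in parallel, that "stop" means SIGTERM first and SIGKILL once the timeout expires, and that "continue" simply records which commands failed.

```python
import subprocess
import time


def run_all(commands, on_failure="stop", kill_timeout=5.0, poll_interval=0.05):
    """Run commands in parallel (hypothetical helper, not zoom's API).

    on_failure="stop":     on the first failure, ask the others to stop, then
                           force-kill any that are still alive after kill_timeout.
    on_failure="continue": ignore failures and let everything else finish.

    Returns (exit codes keyed by command index, sorted indices of failures).
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    codes = {}
    stopping_since = None  # time at which we first asked the others to stop

    while len(codes) < len(procs):
        for i, proc in enumerate(procs):
            if i in codes:
                continue
            code = proc.poll()
            if code is None:
                continue  # still running
            codes[i] = code
            if code != 0 and on_failure == "stop" and stopping_since is None:
                stopping_since = time.monotonic()
                for j, other in enumerate(procs):
                    if j not in codes and other.poll() is None:
                        other.terminate()  # SIGTERM on POSIX

        # Escalate once the grace period has passed.
        if (stopping_since is not None
                and time.monotonic() - stopping_since > kill_timeout):
            for j, other in enumerate(procs):
                if j not in codes and other.poll() is None:
                    other.kill()  # SIGKILL on POSIX

        time.sleep(poll_interval)

    # In "stop" mode this also includes commands we terminated ourselves,
    # since they exit with a non-zero (negative, on POSIX) code.
    failed = sorted(i for i, code in codes.items() if code != 0)
    return codes, failed
```

The returned list of failed indices is one possible starting point for the retry tracking mentioned above: a later run could be given only those commands. Whether zoom should track failures by index, by command line, or some other way is left open here.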