Add different ways of handling failures
pwr22 opened this issue · 0 comments
pwr22 commented
Right now, if one command fails, we try to stop every other command and then wait for them all to finish. That way things always end gracefully and we always get all of the output (zoom was designed for some tools that are a bit flaky at flushing their output).
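To make the current behaviour concrete, here is a rough sketch of "stop everything on the first failure, then wait for all output". This is not zoom's actual code: the function name `run_stop_on_failure` and the choice of Python's `subprocess` and `threading` modules are assumptions made purely for illustration.

```python
import subprocess
import threading


def run_stop_on_failure(commands):
    """Hypothetical sketch: run commands in parallel, and on the first
    failure ask the others to stop, then wait for every one to finish."""
    procs = [
        subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so no output is lost
            text=True,
        )
        for cmd in commands
    ]
    results = [None] * len(procs)

    def collect(index):
        proc = procs[index]
        # communicate() reads until the command closes its output,
        # so late-flushed output is still captured.
        output, _ = proc.communicate()
        results[index] = (proc.returncode, output)
        if proc.returncode != 0:
            # Ask every command that is still running to stop
            # (SIGTERM on POSIX); we never force-kill here.
            for other in procs:
                if other.poll() is None:
                    other.terminate()

    threads = [
        threading.Thread(target=collect, args=(i,)) for i in range(len(procs))
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # always wait for everything to finish
    return results
```

Each entry of the returned list is an `(exit_code, output)` pair in the same order as `commands`. Note that a command which ignores the stop request would keep this sketch waiting indefinitely.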
Other failure modes we should support:
- Eventually sending `SIGKILL` to commands after some timeout.
- Ignoring failures and continuing in a best effort to complete as much work per run as possible (later probably want to add some way of tracking what failed to allow retries).
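The two modes above could look roughly like the sketch below. Everything in it is an assumption for illustration, not zoom's API: the function `run_all`, the `on_failure` values `"stop"` and `"continue"`, and the `kill_after` timeout are hypothetical names. On POSIX, `Popen.terminate()` sends `SIGTERM` and `Popen.kill()` sends `SIGKILL`.

```python
import subprocess
import time


def run_all(commands, on_failure="stop", kill_after=5.0, poll_interval=0.05):
    """Hypothetical sketch of two failure modes.

    on_failure="stop":     on the first failure, ask the other commands to
                           stop, then kill any still alive after kill_after
                           seconds.
    on_failure="continue": ignore failures and let every command finish.

    Returns (exit_codes, failed_indexes) so failures can be retried later.
    """
    if on_failure not in ("stop", "continue"):
        raise ValueError(f"unknown on_failure mode: {on_failure!r}")

    procs = [subprocess.Popen(cmd) for cmd in commands]
    exit_codes = [None] * len(procs)
    kill_deadline = None  # set once a stop has been requested

    while any(code is None for code in exit_codes):
        # Record the exit code of every command that has finished.
        for i, proc in enumerate(procs):
            if exit_codes[i] is None:
                exit_codes[i] = proc.poll()

        failed = any(code not in (None, 0) for code in exit_codes)
        if failed and on_failure == "stop" and kill_deadline is None:
            # First failure: request a graceful stop and start the clock.
            for proc in procs:
                if proc.poll() is None:
                    proc.terminate()
            kill_deadline = time.monotonic() + kill_after

        if kill_deadline is not None and time.monotonic() >= kill_deadline:
            # Grace period is over: force-kill whatever is still running.
            for proc in procs:
                if proc.poll() is None:
                    proc.kill()

        time.sleep(poll_interval)

    failed_indexes = [i for i, code in enumerate(exit_codes) if code != 0]
    return exit_codes, failed_indexes
```

With `on_failure="continue"` the returned `failed_indexes` is the start of the "tracking what failed" idea: a caller could rerun just those commands. In `"stop"` mode the list also includes commands that were terminated or killed, since they exit with a non-zero status too.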