saurabhnanda/odd-jobs

Modifying exponential backoff for failures

Opened this issue · 4 comments

Our team is giving odd-jobs a try and we're loving it! The only current concern we have is that we would like to avoid exponential backoff for failures since we only deal with transient failures. Would it be possible to add a config to modify it, such as adding a limit to the time added, or removing it altogether?

This is the line we are concerned with:

, jobRunAt=(addUTCTime (fromIntegral $ (2::Int) ^ jobAttempts) t)
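For reference, here is a minimal sketch of what that line computes and what a capped or constant variant could look like. Only `addUTCTime` and the `2 ^ jobAttempts` delay come from the quoted line; `exponentialDelay`, `cappedDelay`, `constantDelay`, and `nextRunAt` are hypothetical names made up for illustration and are not part of odd-jobs.

```haskell
import Data.Time.Clock (NominalDiffTime, UTCTime, addUTCTime)

-- Delay used by the quoted line: 2 ^ attempts seconds.
-- Integer avoids overflow for large attempt counts; negative
-- attempt counts are clamped to 0 (an assumption of this sketch).
exponentialDelay :: Int -> NominalDiffTime
exponentialDelay attempts = fromIntegral ((2 :: Integer) ^ max 0 attempts)

-- Hypothetical: the same growth, but never longer than maxDelay.
cappedDelay :: NominalDiffTime -> Int -> NominalDiffTime
cappedDelay maxDelay attempts = min maxDelay (exponentialDelay attempts)

-- Hypothetical: no backoff at all, always wait the same amount.
constantDelay :: NominalDiffTime -> Int -> NominalDiffTime
constantDelay delay _attempts = delay

-- Applies a delay policy to the current time, like the quoted line does.
nextRunAt :: (Int -> NominalDiffTime) -> Int -> UTCTime -> UTCTime
nextRunAt delayFor attempts now = addUTCTime (delayFor attempts) now
```

With something like this, `nextRunAt (cappedDelay 60) jobAttempts t` would behave like the current code for the first few attempts and then settle at one minute between retries.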

I could open a PR to add a config option, but it looks like the roadmap already plans to do something with per-job failure handling, and I don't want to step on any toes.

Hey @aschmois, great to know that you're planning to use odd-jobs. Do comment at #44 when your implementation finally makes it to production.

> The only current concern we have is that we would like to avoid exponential backoff for failures since we only deal with transient failures. Would it be possible to add a config to modify it, such as adding a limit to the time added, or removing it altogether?

In a separate discussion thread, we have already established a need for evolving cfgJobRunner so that it can tell odd-jobs how to re-queue or retry the job.

On similar lines I can see that your feature-request can be handled in two ways:

  1. Add a new cfgCalculateNextRunAt :: Job -> IO UTCTime function and use it at the appropriate place
  2. Or, change cfgOnJobFailed :: JobErrHandler NextJobAction, where NextJobAction is the same type as cfgJobRunner :: Job -> IO NextJobAction, and can probably look something like:
     data NextJobAction = ActionSuccess | ActionFailed | ActionRetry UTCTime | ...

The second approach can be used to build cron-like functionality, as well.
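To make the two options concrete, here is a rough sketch under stated assumptions. `NextJobAction` and its three constructors are taken from the proposal above (the proposal leaves the constructor list open-ended, so this is not the full type). `JobInfo`, `fixedDelayNextRunAt`, and `retryPolicy` are hypothetical stand-ins made up for illustration; none of this is existing odd-jobs API, and the current time is passed in explicitly rather than read in `IO` to keep the sketch pure.

```haskell
import Data.Time.Clock (NominalDiffTime, UTCTime, addUTCTime)

-- Constructors shown in the proposal; the real type may have more.
data NextJobAction
  = ActionSuccess
  | ActionFailed
  | ActionRetry UTCTime
  deriving (Eq, Show)

-- Hypothetical stand-in for the parts of a job a retry policy needs.
data JobInfo = JobInfo
  { jiAttempts    :: Int  -- attempts made so far
  , jiMaxAttempts :: Int  -- give up once this many attempts were made
  }

-- Option 1 (in the spirit of cfgCalculateNextRunAt): only decides
-- *when* the next run happens. Here: a fixed delay, no backoff.
fixedDelayNextRunAt :: NominalDiffTime -> UTCTime -> JobInfo -> UTCTime
fixedDelayNextRunAt delay now _job = addUTCTime delay now

-- Option 2 (in the spirit of a handler returning NextJobAction):
-- decides both *whether* and *when* to retry.
retryPolicy :: NominalDiffTime -> UTCTime -> JobInfo -> NextJobAction
retryPolicy delay now job
  | jiAttempts job >= jiMaxAttempts job = ActionFailed
  | otherwise                           = ActionRetry (addUTCTime delay now)
```

The difference shows up in the return type: option 1 can only move the next run time, while option 2 can also stop retrying. A job that finishes successfully could likewise return `ActionRetry` with a later time, which is what would make cron-like behaviour possible.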

What're your thoughts?

This is a good feature to have. I've added it to the roadmap.

@aschmois any thoughts on my previous comments?

@saurabhnanda I apologize. We discussed it as a team and we'd like to work on this. I don't know exactly when we'll get some free time, but I assume it'll be soon! I think I'd like to take a look at the NextJobAction approach, since it'll also help us with some other oddities, like a job scheduling itself again after it completes (cron-like jobs).