shiblon/entroq

allow adjusting poll time for worker


I'm trying to reduce DefaultClaimPollTime for the worker, but the worker has no way to configure that.
Should we allow configuring the claim options for the worker? Currently all ClaimOpts can be configured for the worker except PollTime. I don't think the poll time should matter much, but if all workers are busy and one worker is waiting for a claim, it introduces a ~30s delay (DefaultClaimPollTime) before the next task is picked up:

task, err := w.eqc.Claim(ctx, From(w.Qs...), ClaimFor(w.lease))

This could be a solution

type Worker struct {
	// ...
	pollTime time.Duration
}

func WithPollTime(d time.Duration) WorkerOption {
	return func(w *Worker) {
		w.pollTime = d
	}
}

func (w *Worker) Run(ctx context.Context, f Work) (err error) {
	// ...
	task, err := w.eqc.Claim(ctx, From(w.Qs...), ClaimFor(w.lease), ClaimPollTime(w.pollTime))
	// ...
}

That's not actually how that works. The poll time should almost never bite you in any way, and decreasing it has very real implications for system health. You can easily spam a network to death with that reduction.

When Claim is run, it immediately returns a claimable task, or it waits for another one. The poll time is a fallback mechanism due to network instability and some (perhaps fixed by now) gRPC bugs that caused connections to fail without dropping after a little while.

The notify/wait mechanism implemented in subq handles that: any client that does a claim is notified immediately when a task becomes available to claim. The one exception to this rule is when a task becomes available because its available time has passed (as opposed to being modified or inserted). In that case, yes, you might need to wait a bit longer, but the assumption there is that if the task wasn't really available when you started listening, a few extra seconds won't hurt: the available time is not a deadline, it's a way of ensuring that other workers don't claim tasks that are already being maintained and worked on.

So, unless you are seeing real problems with this, I would like to suggest that we not make any changes. If you are seeing real problems, where you have one worker waiting on an empty queue, and then inserting a task doesn't cause it to be immediately picked up, I would like to hear about that because it's a bug :)

I see I wasn't using WithNotifyWaiter with the subq on the pg backend. Makes sense now :)
After I added the subq waiter, I noticed the issue was at the Claim call within the worker that waits for a value in a specific response queue. Thank you for the clarification, great help as usual :)

My pleasure! Also, a key thing to note if you poke around the code some more (which is totally encouraged, it just didn't happen to need a change this time!) is this: if I start a variable name with the word Default, it's really a default and should never be configurable. Anything configurable won't start with Default; instead, it falls back to the default value when the configurable value is not explicitly set.

Basically, for configurable values, you have two consts/variables:

SomeValue and DefaultSomeValue. The first is configurable and falls back to the second, which is not configurable.

Sometimes there is no non-default value, and IIRC that's the case here, which means it just isn't configurable at all :)