Bogdanp/dramatiq

Some messages not being processed

Closed this issue · 4 comments

dlip commented

What OS are you using?

Debian bookworm, Python 3.11

What version of Dramatiq are you using?

1.17.0

What did you do?

Submitted a task

What did you expect would happen?

To be processed

What happened?

Some messages are not being processed

A normal message has two log entries:

/usr/local/lib/python3.11/site-packages/dramatiq/worker.py:327 Pushing message '5bc093e7-3048-47ab-9159-e756987ec2df' onto work queue.
/usr/local/lib/python3.11/site-packages/dramatiq/worker.py:481 Received message foo('bar') with id '5bc093e7-3048-47ab-9159-e756987ec2df'.

But a lost message has only one log entry:

/usr/local/lib/python3.11/site-packages/dramatiq/worker.py:327 Pushing message '6de4d71f-8f5e-4b9a-8939-79d90c462b56' onto work queue.

I am using Kafka as the message queue, but from the log the message seems to be arriving OK. Could you confirm whether Kafka could be the issue? Are there any other suggestions for debugging this?
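One way to narrow this down is to scan the worker's debug log for message IDs that were pushed onto the work queue but never received by a worker thread. The following is only a rough sketch that assumes the two log line formats quoted above; `find_unprocessed` is a hypothetical helper, not part of Dramatiq.

```python
import re

# Patterns based on the two debug log lines quoted in this issue.
PUSHED = re.compile(r"Pushing message '([0-9a-f-]+)' onto work queue")
RECEIVED = re.compile(r"Received message .* with id '([0-9a-f-]+)'")


def find_unprocessed(log_lines):
    """Return IDs that were pushed but never picked up, in log order."""
    pushed = []
    received = set()
    for line in log_lines:
        match = PUSHED.search(line)
        if match:
            pushed.append(match.group(1))
            continue
        match = RECEIVED.search(line)
        if match:
            received.add(match.group(1))
    return [message_id for message_id in pushed if message_id not in received]
```

Feeding it the lines of the worker log (for example `find_unprocessed(open(path))`, with `path` pointing at wherever the log is written) should list the IDs that are stuck, which can then be compared against the times at which retries were running.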

dlip commented

Another relevant point: I see other messages being pushed onto the queue and successfully processed shortly after the log entry for 6de4d71f-8f5e-4b9a-8939-79d90c462b56.

Bogdanp commented

It’s probably best to open an issue in the linked repo about this. It seems unlikely that this is an issue with Dramatiq itself.

dlip commented

@Bogdanp if the message has been pushed onto the work queue, wouldn't that make it an issue with the workers? The delay logic in the Kafka provider is blocking (it's just a time.sleep), so I'm thinking I may be running out of workers because they're stuck there doing retries. Does that sound plausible?
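To illustrate the hypothesis, here is a small sketch of how blocking sleeps can starve a fixed-size pool. It uses the standard library's `ThreadPoolExecutor` as a stand-in and is not Dramatiq's or the Kafka provider's actual code; `blocking_retry` and `measure_starvation` are made-up names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_retry(delay_s):
    # Stand-in for a retry delay implemented as a plain time.sleep.
    time.sleep(delay_s)
    return "retried"


def measure_starvation(workers=2, delay_s=0.2):
    """Return how long a trivial task waits while every worker is sleeping."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Occupy every worker thread with a blocking delay.
        for _ in range(workers):
            pool.submit(blocking_retry, delay_s)
        start = time.monotonic()
        # This task is queued immediately but cannot start until a sleeper wakes up.
        pool.submit(lambda: "done").result()
        return time.monotonic() - start
```

With two workers and a 0.2 second delay, the trivial task should wait roughly the full delay before it runs. Scaled up to long retry delays, that would look like a message that is pushed onto the work queue but not received for a long time.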

dlip commented

The retries were the issue, combined with a bug where the delay wasn't converted from milliseconds to seconds, which made the wait 1000x longer.
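For anyone hitting something similar: Dramatiq expresses message delays in milliseconds, while `time.sleep` takes seconds. A minimal sketch of the conversion follows, with hypothetical names (`delay_seconds`, `wait_for_retry`) that are not taken from the Kafka provider:

```python
import time


def delay_seconds(delay_ms):
    """Convert a delay in milliseconds to the seconds time.sleep expects."""
    return delay_ms / 1000.0


def wait_for_retry(delay_ms, sleep=time.sleep):
    # Passing delay_ms directly to sleep() would wait 1000x too long.
    sleep(delay_seconds(delay_ms))
```

The `sleep` parameter is injectable only so the delay can be checked in a test without actually waiting.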