que-rb/que

Deleting large numbers of finished jobs


gshaw commented

We create 200-300K jobs a day and leave them in place for 7 days for analysis. A worker then deletes jobs that finished more than 7 days ago. It all works, but it can be slow: under extra load (e.g., 1M jobs needing to be deleted) it can take longer than 60 seconds and hit the Postgres statement timeout.

Even with an index on finished_at, we've identified that the problem is the que_state_notify trigger doing a lot of work when a large number of records is deleted.

I was wondering if others have had this problem and what they've done to solve it, before I dive too deep into it.
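
The cleanup boils down to a bulk DELETE keyed on finished_at; a minimal sketch of that kind of query (the worker's exact query may differ):

-- Hypothetical cleanup query: remove jobs that finished more than 7 days ago.
-- With ~1M matching rows, the work done by the que_state_notify trigger can
-- push this past a 60-second statement timeout.
DELETE FROM que_jobs
WHERE finished_at < now() - interval '7 days';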

@gshaw a bit late but:

BEGIN;
-- Temporarily disable the notify trigger so the bulk delete skips its work.
ALTER TABLE que_jobs DISABLE TRIGGER que_state_notify;
DELETE FROM que_jobs;
-- Re-enable the trigger before committing.
ALTER TABLE que_jobs ENABLE TRIGGER que_state_notify;
COMMIT;

Of course, you can add conditions to the DELETE statement (finished_at, etc.).
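
For instance, with the 7-day window from the original post (a sketch; adjust the interval to your retention policy):

-- Replaces the unconditional DELETE above:
DELETE FROM que_jobs WHERE finished_at < now() - interval '7 days';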
Check for locks on the table before running this transaction (see https://stackoverflow.com/a/43361191).
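
One way to do that lock check (a sketch against the standard pg_locks and pg_stat_activity views, along the lines of the linked answer):

SELECT l.pid, l.mode, l.granted, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'que_jobs'::regclass;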

Thanks for the solution, @kofronpi! This would definitely be worth at least documenting. I'll leave the issue open to track that.