RealOrangeOne/django-tasks

Concurrency issues on SQLite

RealOrangeOne opened this issue · 3 comments

SQLite doesn't support select_for_update, which is a key component of the database worker to ensure that a task is only picked up by a single database worker.

In Django 5.1, it's possible to change how the transactions are created. If transactions are created using EXCLUSIVE, this solves the concurrency issues, although increases the risk of locking errors (something we already work around with a retry, but it can reduce throughput). Therefore, without EXCLUSIVE, the task semantic changes from "at most once" to non-deterministic "zero or more times", which is at best unhelpful and at most a great source of confusion and complexity.

I see 2 possible solutions for this:

  1. Require that EXCLUSIVE transactions be used if SQLite is in a system check, and prevent usage without it
  2. Backport the functionality into django-tasks. This is far more complex, but it allows users of Django 4.2 and 5.0 to use SQLite.

update ... returning can be used in place of select for update, if I understand the issue correctly.

You'd still need a lock on the table to ensure 2 UPDATEs weren't running at the same time, which I think could still cause concurrency issues. I'm also not sure RETURNING is supported by Django, and I'd much rather avoid any custom SQL if we can avoid it.

It replicates a lock. Eg,

UPDATE jobs
SET started_at = NOW()
WHERE started_at IS NULL
RETURNING *
LIMIT 1;

So if a row is returned, a lock has been acquired.