500 "attempt to write a readonly database" error caused by "PRAGMA schema_version"
I've not been able to replicate this myself yet, but I've seen log files from a user affected by it.
```
  File "/usr/local/lib/python3.11/site-packages/datasette/views/base.py", line 89, in dispatch_request
    await self.ds.refresh_schemas()
  File "/usr/local/lib/python3.11/site-packages/datasette/app.py", line 371, in refresh_schemas
    await self._refresh_schemas()
  File "/usr/local/lib/python3.11/site-packages/datasette/app.py", line 386, in _refresh_schemas
    schema_version = (await db.execute("PRAGMA schema_version")).first()[0]
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 267, in execute
    results = await self.execute_fn(sql_operation_in_thread)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 211, in in_thread
    return fn(conn)
           ^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 237, in sql_operation_in_thread
    cursor.execute(sql, params if params is not None else {})
sqlite3.OperationalError: attempt to write a readonly database
```
That's running the official Datasette Docker image on https://fly.io/ - it's causing 500 errors on every page of their site.
I thought it might relate to the "defensive mode" issue described here:
But I have since determined that the Datasette official Docker image does NOT run anything in defensive mode, so I don't think it's related to that.
Asked on the SQLite Forum if anyone has seen this before: https://sqlite.org/forum/forumpost/793a2ed75b
Here's the relevant code:
Lines 421 to 437 in 5890a20
This function is called on almost every request (everything that subclasses `BaseView`, at least - something to remember for the refactor in #2053 etc).
datasette/datasette/views/base.py
Lines 101 to 103 in 5890a20
It uses `PRAGMA schema_version` as a cheap way to determine if the schema has changed, in which case it needs to refresh the internal schema tables.
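The pattern can be sketched like this - a minimal standalone illustration, not Datasette's actual `refresh_schemas()` code (the `schema_changed()` helper name is mine):

```python
import sqlite3

def schema_changed(conn, cached_version):
    # PRAGMA schema_version returns an integer that SQLite increments
    # every time the database schema changes, so comparing it against a
    # cached value is a cheap "has anything changed?" test
    version = conn.execute("PRAGMA schema_version").fetchone()[0]
    return version != cached_version, version

conn = sqlite3.connect(":memory:")
changed, v1 = schema_changed(conn, None)
conn.execute("create table demo (id integer primary key)")
changed, v2 = schema_changed(conn, v1)
# changed is True here: creating the table bumped schema_version
```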
This was already the cause of a subtle bug here:
Here's a potential workaround: when I store the schema versions, I could also store an MD5 hash of the full schema (`select group_concat(sql) from sqlite_master`). When reading the schema version with `PRAGMA schema_version` I could catch that exception and, if I see it, calculate that MD5 hash again as a fallback and use that to determine if the schema has changed instead.
The performance overhead of this needs investigating - how much more expensive is `md5(... that SQL query result)` compared to just `PRAGMA schema_version`, especially on a database with a lot of tables?
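A sketch of what that fallback could look like, assuming the error surfaces as `sqlite3.OperationalError` as in the traceback above (`schema_fingerprint()` is a hypothetical helper, not part of Datasette):

```python
import hashlib
import sqlite3

def schema_fingerprint(conn):
    # Prefer the cheap PRAGMA; fall back to hashing the full schema SQL
    # if reading the PRAGMA fails (e.g. "attempt to write a readonly
    # database"). Callers only compare fingerprints for equality, so it
    # does not matter that one branch returns an int and the other a str.
    try:
        return conn.execute("PRAGMA schema_version").fetchone()[0]
    except sqlite3.OperationalError:
        sql = conn.execute(
            "select group_concat(sql) from sqlite_master"
        ).fetchone()[0]
        return hashlib.md5((sql or "").encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
conn.execute("create table demo (id integer primary key)")
fingerprint = schema_fingerprint(conn)
```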
I got ChatGPT (code execution alpha) to run a micro-benchmark for me. This was the conclusion:
The benchmark using `PRAGMA schema_version` is approximately 1.36 times faster than the benchmark using `hashlib.md5` for the case with 100 tables. For the case with 200 tables, the benchmark using `PRAGMA schema_version` is approximately 2.33 times faster than the benchmark using `hashlib.md5`.
Here's the chart it drew me:
(It's a pretty rubbish chart though: it only took measurements at 100 and 200 tables and drew a line between the two. I should have told it to measure every 10 tables and plot those points.)
And the full transcript: https://gist.github.com/simonw/2fc46effbfbe49e6de0bcfdc9e31b235
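For anyone who wants to re-run the comparison locally, here's a rough reconstruction of that micro-benchmark (my sketch, not the exact code from the transcript):

```python
import hashlib
import sqlite3
import timeit

def make_db(num_tables):
    # Build an in-memory database with the requested number of tables
    conn = sqlite3.connect(":memory:")
    for i in range(num_tables):
        conn.execute(f"create table t{i} (id integer primary key, name text)")
    return conn

def via_pragma(conn):
    return conn.execute("PRAGMA schema_version").fetchone()[0]

def via_md5(conn):
    sql = conn.execute(
        "select group_concat(sql) from sqlite_master"
    ).fetchone()[0]
    return hashlib.md5(sql.encode("utf-8")).hexdigest()

for n in (100, 200):
    conn = make_db(n)
    pragma_t = timeit.timeit(lambda: via_pragma(conn), number=1000)
    md5_t = timeit.timeit(lambda: via_md5(conn), number=1000)
    print(f"{n} tables: pragma={pragma_t:.4f}s md5={md5_t:.4f}s "
          f"ratio={md5_t / pragma_t:.2f}x")
```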
The benchmark looks good enough on first glance that I don't feel the need to be more thorough with it. `PRAGMA schema_version` is faster, but not by enough that the MD5 hack is worth worrying about too much.
I'm tempted to add something to the `/-/versions` page that tries to identify if this is a problem or not though.
Really interesting how you are using ChatGPT in this.
I tried deploying the fix you submitted, but I'm still getting the same errors. I can paste the logs here if needed, but I really don't see anything new in them.