mattn/go-sqlite3

database is locked

stevenh opened this issue ยท 38 comments

We have a little app which has multiple goroutines all running queries and it often fails with "database is locked" :(

we're running with:

db, err := sql.Open("sqlite3", "test.db?cache=shared&mode=rwc")
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;

sqlite3 -version
3.10.2 2016-01-20 15:27:19 17efb4209f97fb4971656086b138599a91a75ff9

Is this a known issue?

mattn commented

did you try _busy_timeout ?

I wasn't aware of this, so no I hadn't.

I would however has expected busy_timeout to only effect table locks but the error says "database is locked" is the error misleading?

Just chiming in here, because I keep running into the same problem. I created a gist that can pretty reliably demonstrate the "database is locked" error. In this program I open 1 goroutine that writes 5000 times to one table (users) and another goroutine that spawns 5000 goroutines that read from another table.

db, err := sql.Open("sqlite3", "database_file.sqlite?cache=shared&mode=rwc") doesn't fix the problem, and adding _busy_timeout makes the program really, really slow.

What helped was using db.SetMaxOpenConns(1). Which is the same as wrapping a mutex around every DB access.

I really want to understand what's going on here and what's possible with sqlite3 in Go and what's not. My guess is that the database is locked error isn't a problem in sqlite (which explicitly allows writing to and reading from different tables at the same time), but rather a combination of sqlite's thread-safety, Go's database/sql connection pooling and go-sqlite3 connection management.

That still results in "database is locked" errors and I think it's even more than before.

Yer, I think the issue is with multiple threads and this message is equivalent to MySQL's deadlock detected, at least in our case, just MySQL's message is much clearer as it doesn't have multiple meanings which seems to be the case with sqlite.

As none of the suggestions resolved the issue we've had to move away from sqlite unfortunately.

I'll leave this bug open for others to contribute to, as its clearly wider spread.

After some more reading and investigating, I think there is no proper solution where we can use multiple sql.DB connections in combination with sqlite3. Why?

  • The thread-safety guarantees sqlite makes only cover the case where you have one sqlite3 *db in your program and access it from different threads. sqlite can then coordinate access to the database in its own functions.
  • If we don't limit the amount of sql.DB connections (which is the database/sql default), we create multiple sqlite3 *db instances in our program. These act independently from each other. Now each connection has to do its own synchronization and they have to synchronise with the the other connections. We're now essentially in "multiple processes accessing the same file"-land. The sqlite FAQ describes this:

SQLite allows multiple processes to have the database file open at once, and for multiple processes to read the database at once. When any process wants to write, it must lock the entire database file for the duration of its update. But that normally only takes a few milliseconds. Other processes just wait on the writer to finish then continue about their business.

(Highlight by me)

The difference between sqlite and client/server databases (PostgreSQL, MySQL, ...) is that sqlite3 synchronizes access to the database file on the client-side, which is only possible if multiple threads/goroutines use the same client. If multiple clients (processes, goroutines with their own connection) use the same database file, the synchronization has to happen through the database file, in which case it's locked completely for every write action.

So I'd say that the only solution is to use db.SetMaxOpenConns(1) (or use the busy timeout).

If anything of what I just said is wrong, please feel free to point it out.

EDIT:

Basically everything I said above is wrong.

Apparently what I said about "thread-safety guarantees" only in "shared db instance across threads" is wrong. I found some sample code that shows usage of multiple db instances across multiple threads. Apparently that is the recommended way to go.

This sqlite wiki entry also shows how sqlite behaves when used in multiple threads with multiple database instances: a lot of SQLITE_BUSY/"database is locked" errors. The conclusion on the wiki page is as follows:

Make sure you're compiling SQLite with -DTHREADSAFE=1.
Make sure that each thread opens the database file and keeps its own sqlite structure.
Make sure you handle the likely possibility that one or more threads collide when they access the db file at the same time: handle SQLITE_BUSY appropriately.

Using one connection across multiple goroutines by using db.SetMaxOpenConns(1) and letting database/sql synchronize access to the shared connection with its mutexes might cause problems, since connections in database/sql are stateful, which means that a db.Begin() in one goroutine effects other goroutines.

To update the conclusion at the top of the original version of this comment:

I think there is no proper solution where we can use multiple sql.DB connections in combination with sqlite3 in Go without handling "database is locked" by retrying or by incrementing the busy_timeout. It seems like this is expected behaviour when using sqlite in a multi-threaded context.

Yer, I think the issue is with multiple threads and this message is equivalent to MySQL's deadlock detected, at least in our case, just MySQL's message is much clearer as it doesn't have multiple meanings which seems to be the case with sqlite.

sqlite's "database is locked" error doesn't have multiple meanings. It means "I tried to obtain a lock on the database but failed because a separate process/connection held a conflicting lock". It's certainly not a deadlock, and is 100% expected in code creating multiple connections to the same DB.

My guess is that the database is locked error isn't a problem in sqlite (which explicitly allows writing to and reading from different tables at the same time),

To be more specific, sqlite's locking protocol allows for many readers and exactly one writer at any given time. ie. if you attempt to start two write transactions concurrently, the second will encounter SQLITE_BUSY.

In rollback-journal mode, a read transaction will also encounter SQLITE_BUSY if it starts just as a write transaction enters its COMMIT phase - as hinted at in previous comments sqlite locks the entire database for the duration of COMMIT. This doesn't happen in WAL mode, because COMMIT doesn't need to update the main database file (just the write ahead journal).

Anyway I think you came to the same conclusion, that SQLITE_BUSY is unavoidable when using multiple connections to the same database. Shared cache mode helps by introducing finer locks (table-level instead of database-level), but still if you have two transactions trying to update the same table at once the second will get SQLITE_BUSY.

Setting the busy timeout is one way to cope with the situation - if your timeout is longer than the longest write transaction performed by your application that will mitigate many of the errors.

But even then, you can still easily get SQLITE_BUSY via a transaction along the lines of:

BEGIN;
SELECT MAX(x) FROM table1;
INSERT INTO table1 VALUES(x+1);
COMMIT;

If there is already a write transaction in progress on table1 when this INSERT statement runs, sqlite will immediately return SQLITE_BUSY, without waiting for the busy timeout.

In rollback-journal mode, this is because sqlite knows it has a read-lock (from the earlier SELECT), so waiting is just going to hold up the existing write transaction.

In WAL mode, this is because the earlier SELECT has the effect of pinning this transaction's view of the database to that point in time. sqlite knows the existing write transaction is going to update database potentially invalidating the SELECT results, and doesn't let you write to the database from this view of the past.

For this conflicting writer case, the proper course of action is for the application to ROLLBACK the transaction, and potentially try again once the other write transaction has completed.

I don't think there is any issue with go-sqlite3 here.

fwiw, I stumbled across this ticket and subsequently found http://stackoverflow.com/questions/32479071/sqlite3-error-database-is-locked-in-golang

After the addition rows.Close() calls to my code I don't seem to be experiencing the database is locked issue.

jrots commented

Still experiencing this issue atm..
So what is the best solution/ practices here to avoid these "locked issues" ?

  • creating a writer connection with : db.SetMaxOpenConns(1) +
  • a separate reader connection for selects without SetMaxOpenConns() ?
  • avoid using tx.Begin() and Commit() as it will block other connections?

@jrots sqlite allows multiple readers but only a single writer at any one time. If you have multiple connections to the same DB, you will inevitably run into database is locked. This is an unavoidable consequence of sqlite's locking model, and therefore something your application needs to deal with. Possible solutions include:

  1. Only ever open a single connection to the DB (but note this can still result in multiple connections to the DB if a user runs multiple instances of your application at once)
  2. Set the busy timeout to something reasonably high (which causes sqlite to retry lock acquisition for a certain amount of time before returning database is locked). In go-sqlite3 this is achieved by appending eg. ?_busy_timeout=5000 to the filename passed to sql.Open, where 5000 is the number of milliseconds to retry.

Also note that tx.Begin() and tx.Commit() are currently affected by issue #184 which can result in a connection getting stuck in an inconsistent state if SQLITE_BUSY is encountered while commit is in progress. But setting an appropriate busy timeout should avoid this situation also (unless you have excessively long lived transactions in which case you might want to rethink your design and/or ensure that you're closing all your transaction/result objects in a timely manner).

jrots commented

@sqweek Thx for the thorough explanation! Are you also using db.SetMaxOpenConns(1) when initialising the connection or just avoid having that setting?

update: ok reread your answer & you probably don't use that, just one initialisation and let it underlying create multiple connections without that setting

roxma commented

@mrnugget

since connections in database/sql are stateful, which means that a db.Begin() in one goroutine effects other goroutines.

I'm confused. I can't find the offical database/sql documentation on this, so here's what I found and looks sensible to me:

http://go-database-sql.org/modifying.html

In Go, a transaction is essentially an object that reserves a connection to the datastore

Which implies that the goroutine which calls the db.Begin() will own the connection, until the transaction ends with db.Commit() or db.Callback()

If it is true, how does this behavior affectes other groutines?

at windows 10, it can run very well, but in centos7.3 database is locked error random occur. _busy_timeout is invalid. but db.SetMaxOpenConns(1) looks run well.

@mattn You should change this line in sqlite3.go:

#cgo CFLAGS: -DSQLITE_ENABLE_RTREE -DSQLITE_THREADSAFE

as the following:

#cgo CFLAGS: -DSQLITE_ENABLE_RTREE -DSQLITE_THREADSAFE=1

It fixes the problem. SQLITE_THREADSAFE must be 1 because a single DB connection can be used from multiple threads when using it from multiple goroutines. See https://www.sqlite.org/threadsafe.html and https://sqlite.org/compile.html#threadsafe

https://golang.org/src/database/sql/doc.txt says:

Given an *sql.DB, it should be possible to share that instance between multiple goroutines, without any extra synchronization.

mattn commented

@sqlitefan Thank you!

If you still have issue, please reopen this. ๐Ÿ‘

It seems the problem isn't fixed yet... ๐Ÿ˜ข

mattn commented

Well, I don't make sure what is your issue but you can't avoid busy timeout. SQLite is not provide infinite blocking.

FYI: this option can be also checked at runtime:

sqlite> SELECT sqlite_compileoption_get(8);
THREADSAFE=1

[edit] see @Papakai comment below

mattn commented

go-sqlite3 can't implement queue for waiting busy because it is contrary of the design of SQLite3. If it's possible to do, sqlite3 already would implemented this feature.

@missinglink
Better way is

PRAGMA compile_options;

So, what is the final solution? How can I handle this error? or How can I avoid this error?

mattn commented

You can check this error with err.(sqlite3.Error).Code == sqlite3.ErrLocked.

You can avoid it by only ever having a single active connection to the DB file. SQLite can only handle one writer at a time, so as soon as you have multiple connections (either in a single process or multiple instances of your program or whatever) and one of them is updating the DB there's a period of time in which it is inaccessible to other connections (they will return SQLITE_BUSY ie. "database is locked").

If you're seeing "database is locked" with a single connection, you probably have a statement hanging around which you've forgotten to close. Otherwise "database is locked" is expected behaviour and you'll have to handle it appropriately (usually by retrying the transaction, or using a busy_timeout so that SQLite automatically retries for a period of time).

@sqweek I am closing all my statements. But db.Exec function throwing an error.
code - https://github.com/vkuznet/transfer2go/blob/master/core/catalog.go#L345-L349

You should check and handle errors instead of ignoring them, for one. Also you should probably open a new issue with more details describing the deadlock, as deadlock is a completely different scenario to "database is locked".

I would assume a simple solution would be either A) using mutexes on the actual query code, or B) using a channel that receives a struct with 1) query 2) params & 3) a response channel. Then any go routine anywhere in the app can run queries while having a single "moderator" insuring the queries are only ever run one-at-a-time.

Mutexes would probably be the simple way to solve this as they would force a one-at-a-time access pattern without any changes to the app query structure.

@mrnugget has a great example of this: https://gist.github.com/mrnugget/0eda3b2b53a70fa4a894

adding cache=shared to DSN and compiling with sqlite_unlock_notify build tag seems to work fine.

fwiw, I stumbled across this ticket and subsequently found http://stackoverflow.com/questions/32479071/sqlite3-error-database-is-locked-in-golang

After the addition rows.Close() calls to my code I don't seem to be experiencing the database is locked issue.

This solved my issue

Not sure if this is documented elsewhere, but I don't think the busy timeout makes any difference in my application unless I set the transaction lock parameter to "immediate" (so that the BEGIN statements are BEGIN IMMEDIATE, I think).

Once that's set, the busy timeout is in effect, and things work much better.

Maybe that's obvious to others, but it wasn't to me. So I thought I'd mention it here in case anybody else comes looking for it.

Also possible I'm misunderstanding what I'm seeing.

@runderwood That is not the case. With a rollback journal, the busy handler will be invoked if you attempt to read while another connection is writing, or vice versa. With WAL, those situations are not generally blocking so you won't normally observe the handler getting invoked. In both modes, starting an immediate transaction while another connection holds a write lock will invoke the busy handler.

Note that there are some cases that can arise where SQLite will detect that invoking the busy handler is pointless and thus returns SQLITE_BUSY without invoking it. This in particular can happen if you use BEGIN or BEGIN DEFERRED and your transaction first reads, then another connection writes, and then your transaction tries to write. For this reason, it is best to use BEGIN IMMEDIATE for any transactions that could write.

I feel like we're saying the same thing, at least mostly.

But I can confidently say that, in my application, in WAL journal mode, if my transactions are initiated with the default (plain BEGIN), then the busy timeout seems to never be used. I get many immediate "database is locked" errors with concurrent write attempts.

When I set this library's "_tx_lock" option to "immediate" (which initiates transactions with BEGIN IMMEDIATE), the busy timeout is observed and I see far, far fewer if any "database is locked" errors.

As I read your comment, particularly the last sentence, I feel like you're confirming that this is the behavior one ought to expect.

Of special interest is the scenario you describe -- which perfectly matches the situation that led me to here:

Note that there are some cases that can arise where SQLite will detect that invoking the busy handler is pointless and thus returns SQLITE_BUSY without invoking it. This in particular can happen if you use BEGIN or BEGIN DEFERRED and your transaction first reads, then another connection writes, and then your transaction tries to write.

I believe this is what was happening to trigger immediate "database is locked" responses in my application. And, as you note, reconfiguring this to use BEGIN IMMEDIATE improves the outcome immensely (insofar as the busy timeout is actually in play).

My purpose in posting here is this: reading through the docs and the comments here, it was not apparent to me that setting the busy timeout without changing the _tx_lock parameter to "immediate" would not make an appreciable difference.

in WAL journal mode, if my transactions are initiated with the default (plain BEGIN), then the busy timeout seems to never be used

This is usually true, because WAL lifts the restrictions on concurrent reads/writes. However, if you use BEGIN (or BEGIN DEFERRED), and the very first operation is to write, then the busy handler will be invoked if some other connection currently holds a write lock.

In other words, it is not technically the case that the busy handler only has an effect when you use BEGIN IMMEDIATE. It's just (a) the other cases where you can get SQLITE_BUSY in WAL mode are less likely to occur, and (b) the situation I described above skips the handler entirely (regardless of whether you are using WAL or rollback journal). Thus it may give the false impression that it does.

Now, to be clear, there has been a suggestion in the past to use shared cache mode to avoid some "database is locked" issues, such as the aforementioned one. However, this is bad advice. Do not use shared cache mode. It only pushes the problem around, and can lead to similar looking "database table is locked" errors. The proper fix is (1) to use BEGIN IMMEDIATE for any transactions that may write to the database, and (2) use an explicit *sql.Tx in Go if you will need to write to the database while iterating through a result set.

So, just to understand more clearly: why do concurrent transactions initiated with plain BEGIN and which first read then write not invoke the busy handler?

The reason is that SQLite (the C library) is acting on limited information and making some pessimistic assumptions:

  1. the data that you read may have been changed by the other writer
  2. the data that you read may have factored into your decision of what to write

Consequently it cannot allow the transaction to succeed, since doing so would violate ACID (namely, isolation). Thus there is no point in invoking the busy handler, since the result will always be the same.

That makes sense and is consistent with my experience.

Had I read the SQLite transaction docs more closely, I would have sooner realized that the default transaction mode is DEFERRED and that the behavior I've observed (and which you've described) is exactly as one should expect:

The default transaction behavior is DEFERRED...DEFERRED means that the transaction does not actually start until the database is first accessed. Internally, the BEGIN DEFERRED statement merely sets a flag on the database connection that turns off the automatic commit that would normally occur when the last statement finishes. This causes the transaction that is automatically started to persist until an explicit COMMIT or ROLLBACK or until a rollback is provoked by an error or an ON CONFLICT ROLLBACK clause. If the first statement after BEGIN DEFERRED is a SELECT, then a read transaction is started. Subsequent write statements will upgrade the transaction to a write transaction if possible, or return SQLITE_BUSY. If the first statement after BEGIN DEFERRED is a write statement, then a write transaction is started.

๐Ÿ‘

@sqweek not solve it since sqlite3 run in serialised mode when THREADSAFE=1 is used

#cgo CFLAGS: -DSQLITE_ENABLE_RTREE -DSQLITE_THREADSAFE=1

Just chiming in here, because I keep running into the same problem. I created a gist that can pretty reliably demonstrate the "database is locked" error.

It seems the issue is with using double quotes instead of single quotes when providing string values (in your example you used "Bobby". Try 'Bobby' instead)

I was getting the same error even without any routines or threads. Using single-quotes fixed the issue.