Why isn't a connection pool being used?
whatamithinking opened this issue · 1 comments
Description
Apologies if I am missing the obvious.
Other similar libraries (aiomy, aioodbc, aiopg, etc.) all have connection pooling built in. The other libraries all seem to take a similar approach where they wrap a library that works with multiple threads but not with async tasks.
Since this library seems to be widely used, as opposed to aioodbc which also supports SQLite but is not in as widespread use, I am a little confused as to why a connection pool is not being used to increase concurrency.
Details
- OS: Windows 10
- Python version: 3.9
- aiosqlite version: 0.17.0
- Can you repro on 'main' branch?
- Can you repro in a clean virtualenv?
Generally speaking, the other DB libraries are pooling connections because a) making a network connection takes a significant amount of time between TCP handshake, etc, so having connections already made and waiting is faster; and b) because network bandwidth is limited, so preventing too many simultaneous connections can help ensure better network utilization.
However, since sqlite by definition is a local database, either on-disk or in memory, the benefits from connection pooling are minimal: there is no network delay to opening a file on disk or allocating memory, and disk bandwidth is much higher (and better managed by the kernel). "Connections" are lightweight compared to mysql or postgres, and can generally be opened or closed at will. The biggest cost to a new connection will likely be spinning up another thread.
I added a simple perf test to measure speed of creating connections, and on my M1 Mac Mini with Python 3.10.2, I was able to open ~4700 connections per second from a file, or ~5300/s for in-memory DBs:
(.venv) user@mordin ~/workspace/aiosqlite main » python -m unittest aiosqlite.tests.perf.PerfTest -k test_connection
Running perf tests for at least 2.0s each...
..
Perf Test Iterations Duration Rate
connection_file 9454 2.0s 4726.7/s
connection_memory 10504 2.0s 5251.8/s
----------------------------------------------------------------------
Ran 2 tests in 4.005s
When it comes to concurrency, you can simply create more connections with aiosqlite, without needing a connection pool. Pooling connections could potentially still help in that regard if you expect to be making a very large number of concurrent requests and want to limit the number of background threads, though you will likely end up with contention on the limited number of connections in the pool instead.
My general suggestion for most use cases with aiosqlite is to just keep the code simple, and open a new connection using a context manager anywhere in the code you need to talk to the db. This also helps keep transactions isolated, and prevents a wide class of bugs around managing queries on shared connection threads.