Stateless Postgres Query Router

PostgreSQL is awesome, but it's hard to manage a single database with some terabytes of data and 10⁵+ queries per second. Current sharding solutions focus on analytical and hybrid workloads (OLAP, HTAP). Moreover, most of those solutions do not provide a smooth path for monolith to sharded transitions, which is why Yandex Cloud's Data Platform team developed SPQR.

SPQR is a production-ready system for horizontal scaling of PostgreSQL via sharding. We appreciate any kind of feedback and contribution to the project.

For more about SPQR, please see docs/.

Main features

SPQR works well when most of your queries can be executed on one shard.

Sharding. If possible, the router tries to determine on the first transaction statement to which shard this transaction should be sent. But you can explicitly specify a shard or a sharding key in a comment request.
Transaction and session pooling. Just as in your favorite connection pooler (Odyssey or PgBouncer).
Multiple routers for fault tolerance. The router stores the sharding rules only for cache purposes. Information about the entire installation is stored inside the QDB service, so the number of routers running simultaneously is unlimited.
Liquid data migrations. Data migration between shards aims to balance the workload across shards proportionally. The main idea is to minimize any locking impact during these migrations, which is accomplished by reducing the size of the data ranges being transferred.
Limited cross-shard queries. SPQR router supports limited subset of cross-shard queries. This is made from best-effort logic in a non-disruptive and non-consistent way and is used mainly for testing purposes. Please do not use this feature in your production, cross-shard snapshot will be inconsistent.
Multiple servers and failover. In the router configuration, it is possible to specify multiple servers for one shard. Then the router will distribute read-only queries among the replicas. However, in addition to the automatic routing, you also have the option to explicitly define the destination for a specific query by using the target-session-attr parameter within the query.
Works over PostgreSQL protocol. It means you can connect to the router and the coordinator via psql to perform administrative tasks.
Dedicated read-only mode. Once enabled, the router will respond to a SHOW transaction_read_only command with "true" and handle only read-only queries, similar to a standard PostgreSQL replica.
Minor overhead for query execution. See benchmarks here and here.
Varias authentication types. From basic OK and plain text to MD5 and SCRUM, see Authentication.md.
Live configuration reloading. You can send a SIGHUP signal to the router's process. This will trigger the router to reload its configuration file and apply any changes without interrupting its operation.
Statistics. You can get access to statitics in router's administrative console via SHOW command.
Fallback execution of unrouted queries to the "world shard". SPQR is optimized for single-shard OLTP queries. But we have long-term plans to support routing queries for 2 or more shards.

Development

You can use make run for a quick example using Docker. For local development, you need the latest Go version.

How to build:

make
make build

How to run:

spqr-router run --config path-to-router-config.yaml

Tests

SPQR has all types of tests: unit, regress, and end-to-end. These tests require Docker Compose, and can be run using make. For more information on testing, please see unittest, regress, and feature sections in Makefile.

License

The SPQR source code is distributed under the PostgreSQL Global Development Group License.

Chat