Scale database reads to replicas in Rails
🍊 Battle-tested at Instacart
Add this line to your application’s Gemfile:
gem 'distribute_reads'
Makara does most of the work. First, update database.yml to use it:
default: &default
  url: postgresql-makara:///
  makara:
    sticky: true
    connections:
      - role: master
        name: primary
        url: <%= ENV["DATABASE_URL"] %>
      - name: replica
        url: <%= ENV["REPLICA_DATABASE_URL"] %>

development:
  <<: *default

production:
  <<: *default
Note: You can use the same instance for the primary and replica in development.
By default, all reads go to the primary instance. To use the replica, do:
distribute_reads { User.count }
Works with multiple queries as well.
distribute_reads do
  User.find_each do |user|                 # replica
    user.orders_count = user.orders.count  # replica
    user.save!                             # primary
  end
end
Distribute all reads in a job with:
class TestJob < ApplicationJob
  distribute_reads

  def perform
    # ...
  end
end
The class-level distribute_reads call accepts the same options as the block form.
ActiveRecord uses lazy evaluation, which can delay the execution of a query to outside of a distribute_reads block. In this case, the primary will be used.
users = distribute_reads { User.where(orders_count: 1) } # not executed yet
Call to_a inside the block to ensure the query runs on a replica.
users = distribute_reads { User.where(orders_count: 1).to_a }
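The timing issue can be reproduced without a database. The sketch below is only a model: Router and LazyRelation are hypothetical stand-ins, not part of distribute_reads or ActiveRecord. Router sends reads to the replica only while its block is running, and LazyRelation defers its "query" until to_a is called.

```ruby
# Hypothetical stand-ins that mimic the timing issue described above.
# Neither Router nor LazyRelation exists in distribute_reads or ActiveRecord.
module Router
  @target = :primary

  class << self
    attr_reader :target

    # Route to the replica only while the block is running.
    def distribute_reads
      previous = @target
      @target = :replica
      yield
    ensure
      @target = previous
    end
  end
end

class LazyRelation
  # Building the relation executes nothing.
  def initialize(rows)
    @rows = rows
  end

  # The "query" runs here and records which target served it.
  def to_a
    @rows.map { |row| { row: row, served_by: Router.target } }
  end
end

# Relation built inside the block, executed after the block has exited.
lazy = Router.distribute_reads { LazyRelation.new([1, 2]) }
late = lazy.to_a

# Relation built and executed inside the block.
eager = Router.distribute_reads { LazyRelation.new([1, 2]).to_a }

puts late.first[:served_by]  # primary
puts eager.first[:served_by] # replica
```

In this model the returned relation carries no memory of the block it was built in, which is why forcing execution inside the block matters.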
Raise an error when replica lag exceeds a limit (specified in seconds):
distribute_reads(max_lag: 3) do
  # raises DistributeReads::TooMuchLag
end
Instead of raising an error, you can fall back to the primary:
distribute_reads(max_lag: 3, lag_failover: true) do
  # ...
end
If you have multiple databases, this only checks lag on the ActiveRecord::Base connection. Specify the connections to check with:
distribute_reads(max_lag: 3, lag_on: [ApplicationRecord, LogRecord]) do
  # ...
end
Note: If lag on any connection exceeds the max lag and lag failover is used, all connections will use their primary.
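The interaction between max_lag, lag_failover, and lag_on can be summarized as a small decision function. This is a sketch of the behavior described above, not the gem's code: choose_target is a hypothetical helper, TooMuchLag is a local stand-in for DistributeReads::TooMuchLag, and treating a lag exactly equal to max_lag as acceptable is an assumption.

```ruby
# Local stand-in for DistributeReads::TooMuchLag (illustration only).
class TooMuchLag < StandardError; end

# Hypothetical helper modeling the documented behavior.
# lags: measured replica lag in seconds, keyed by connection name.
def choose_target(lags, max_lag:, lag_failover: false)
  # Assumption: only lag strictly greater than max_lag counts as too high.
  lagging = lags.select { |_name, lag| lag > max_lag }

  return :replica if lagging.empty?
  # One lagging connection sends every connection to its primary.
  return :primary if lag_failover

  raise TooMuchLag, "Replica lag over #{max_lag} seconds on #{lagging.keys.join(', ')}"
end

lags = { "ApplicationRecord" => 0.5, "LogRecord" => 4.2 }

puts choose_target(lags, max_lag: 5)                     # replica
puts choose_target(lags, max_lag: 3, lag_failover: true) # primary
```

Without lag_failover, the second call would raise TooMuchLag, even though only LogRecord is behind.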
If no replicas are available, primary is used. To prevent this situation from overloading the primary, you can raise an error instead.
distribute_reads(failover: false) do
  # raises DistributeReads::NoReplicasAvailable
end
Change the defaults:
DistributeReads.default_options = {
  lag_failover: true,
  failover: false
}
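A rough way to picture how defaults and per-call options combine is a hash merge in which the per-call values win. The snippet below is a model under that assumption, not the gem's implementation; DEFAULTS and effective_options are hypothetical names.

```ruby
# Hypothetical model of defaults combined with per-call options.
# Assumption: options passed to a single call override the defaults.
DEFAULTS = { lag_failover: true, failover: false }.freeze

def effective_options(overrides = {})
  DEFAULTS.merge(overrides)
end

# No overrides: the defaults apply unchanged.
p effective_options

# A per-call max_lag is added, and failover is overridden for this call only.
p effective_options(max_lag: 3, failover: true)
```

Under this model, a call such as distribute_reads(max_lag: 3) would still pick up lag_failover: true from the defaults.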
At some point, you may wish to distribute reads by default.
DistributeReads.by_default = true
To make queries go to primary, use:
distribute_reads(primary: true) do
  # ...
end
Get replication lag in seconds
DistributeReads.replication_lag
Thanks to TaskRabbit for Makara, Sherin Kurian for the max lag option, and Nick Elser for the write-through cache.
View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To test, run:
git clone https://github.com/ankane/distribute_reads.git
cd distribute_reads
createdb distribute_reads_test_primary
createdb distribute_reads_test_replica
bundle
bundle exec rake