Zero downtime migrations with ActiveRecord and PostgreSQL.
Catch problematic migrations at development/test time! Heavily inspired by these similar projects:
Simply add this gem to the project Gemfile
.
gem "zero_downtime_migrations"
This gem will automatically raise exceptions when potential database locking migrations are detected.
It checks for common things like:
- Adding a column with a default
- Adding a non-concurrent index
- Mixing data changes with index or schema migrations
- Performing data or schema migrations with the DDL transaction disabled
- Using
each
instead offind_each
to loop thruActiveRecord
objects
These exceptions display clear instructions of how to perform the same operation the "zero downtime way".
This can take a long time with significant database size or traffic and lock your table!
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
add_column :posts, :published, :boolean, default: true
end
end
First let’s add the column without a default. When we add a column with a default it has to lock the table while it performs an UPDATE for ALL rows to set this new default.
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
add_column :posts, :published, :boolean
end
end
Then we’ll set the new column default in a separate migration. Note that this does not update any existing data! This only sets the default for newly inserted rows going forward.
class SetPublishedDefaultOnPosts < ActiveRecord::Migration[5.0]
def change
change_column_default :posts, :published, true
end
end
Finally we’ll backport the default value for existing data in batches. This should be done in its own migration as well. Updating in batches allows us to lock 1000 rows at a time (or whatever batch size we prefer).
class BackportPublishedDefaultOnPosts < ActiveRecord::Migration[5.0]
def change
Post.select(:id).find_in_batches.with_index do |batch, index|
puts "Processing batch #{index}\r"
Post.where(id: batch).update_all(published: true)
end
end
end
This action can lock your database table while indexing existing data!
class IndexUsersOnEmail < ActiveRecord::Migration[5.0]
def change
add_index :users, :email
end
end
Instead, let's add the index concurrently in its own migration with the DDL transaction disabled.
This allows PostgreSQL to build the index without locking in a way that prevent concurrent inserts, updates, or deletes on the table. Standard indexes lock out writes (but not reads) on the table.
class IndexUsersOnEmail < ActiveRecord::Migration[5.0]
disable_ddl_transaction!
def change
add_index :users, :email, algorithm: :concurrently
end
end
Performing migrations that change the schema, update data, or add indexes within one big transaction is unsafe!
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
add_column :posts, :published, :boolean
Post.update_all(published: true)
add_index :posts, :published
end
end
Instead, let's split apart these types of migrations into separate files.
- Introduce schema changes with methods like
create_table
oradd_column
in one file. These should be run within a DDL transaction so that they can be rolled back if there are any issues. - Update data with methods like
update_all
orsave
in another file. Data migrations tend to be much more error prone than changing the schema or adding indexes. - Add indexes concurrently within their own file as well. Indexes should be created without the DDL transaction enabled to avoid table locking.
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
add_column :posts, :published, :boolean
end
end
class BackportPublishedOnPosts < ActiveRecord::Migration[5.0]
def change
Post.update_all(published: true)
end
end
class IndexPublishedOnPosts < ActiveRecord::Migration[5.0]
disable_ddl_transaction!
def change
add_index :posts, :published, algorithm: :concurrently
end
end
The DDL transaction should only be disabled for migrations that add indexes. All other types of migrations should keep the DDL transaction enabled so that changes can be rolled back if any unexpected errors occur.
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
disable_ddl_transaction!
def change
add_column :posts, :published, :boolean
end
end
class UpdatePublishedOnPosts < ActiveRecord::Migration[5.0]
disable_ddl_transaction!
def change
Post.update_all(published: true)
end
end
Any other data or schema changes must live in their own migration files with the DDL transaction enabled just in case they make changes that need to be rolled back.
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
add_column :posts, :published, :boolean
end
end
class UpdatePublishedOnPosts < ActiveRecord::Migration[5.0]
def change
Post.update_all(published: true)
end
end
This might accidentally load tens or hundreds of thousands of records into memory all at the same time!
class BackportPublishedDefaultOnPosts < ActiveRecord::Migration[5.0]
def change
Post.all.each do |post|
post.update_attribute(published: true)
end
end
end
Let's use the find_each
method to fetch records in batches instead.
class BackportPublishedDefaultOnPosts < ActiveRecord::Migration[5.0]
def change
Post.all.find_each do |post|
post.update_attribute(published: true)
end
end
end
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
rename_column :posts, :published, :published_new_name
end
end
- Introduce schema changes with methods like
create_table
oradd_column
- Copy data with methods like
update_all
- Introduce code to write to both columns
- Deploy
- Add code to stop writing to original column
- Deploy
- Drop the column in last separate deploy
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
add_column :posts, :published_DEPRECATED, :boolean #published is original column
Post.update_all("\"published_DEPRECATED\"=published") #need to quote capitals in postgres
end
end
class CopyPublishedDataOnPosts < ActiveRecord::Migration[5.0]
def change
remove_column :posts, :published
end
end
- Changing a column type
- Removing a column
- Renaming a table
We can disable any of these "zero downtime migration" enforcements by wrapping them in a safety_assured
block.
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
def change
safety_assured do
add_column :posts, :published, :boolean, default: true
end
end
end
We can also mark an entire migration as safe by using the safety_assured
helper method.
class AddPublishedToPosts < ActiveRecord::Migration[5.0]
safety_assured
def change
add_column :posts, :published, :boolean
Post.where("created_at >= ?", 1.day.ago).update_all(published: true)
end
end
Enforcements can be globally disabled by setting ENV["SAFETY_ASSURED"]
when running migrations.
SAFETY_ASSURED=1 bundle exec rake db:migrate --trace
These enforcements are automatically disabled by default for the following scenarios:
- The database schema is being loaded with
rake db:schema:load
instead ofdb:migrate
- The current migration is a reverse (down) migration
- The current migration is named
RollupMigrations
bundle exec rspec
- Fork the project.
- Make your feature addition or bug fix.
- Add tests for it. This is important so we don't break it in a future version unintentionally.
- Commit, do not mess with the version or history.
- Open a pull request. Bonus points for topic branches.
MIT - Copyright © 2016 LendingHome