Bi-Directional and Self-Referential Associations in Rails
I've been working on an application that matches users based on a complex set of criteria (read: a big, slow database query plus in-memory processing). The core of the application revolves around these user matches, so I want to make sure that the algorithm either runs very fast or can be cached, so that it doesn't run every time a user visits their matches page.
The most important requirement for our matches is that a match for one user also needs to be a match for the other user. Formally, ∀ x, y ∈ Users: f(x) ∋ y → f(y) ∋ x. That is, for all users x and y, if matched_users belonging to x contains y, then matched_users belonging to y must also contain x. It should automatically stay in sync from both sides of the relationship. The matching algorithm will do that (slowly), but we also care about some of the metadata behind a match, like how long users have been considered a match.
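To make the requirement concrete, here is a minimal plain-Ruby sketch of the symmetry check. It is only an illustration: the `symmetric_matches?` helper and the hash-of-sets representation are assumptions for this example, not part of the application.

```ruby
require "set"

# Returns true when every match is reciprocated.
# `matches` maps a user name to the Set of names that user is matched with.
def symmetric_matches?(matches)
  matches.all? do |user, matched|
    # A user missing from the hash is treated as having no matches.
    matched.all? { |other| matches.fetch(other, Set.new).include?(user) }
  end
end

in_sync = { "alice" => Set["bob"], "bob" => Set["alice"] }
one_way = { "alice" => Set["bob"], "bob" => Set.new }

symmetric_matches?(in_sync) # => true
symmetric_matches?(one_way) # => false
```

The rest of this post is about making the second situation impossible at the database level.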
To solve this problem and meet all of the requirements, we can create a bi-directional, self-referential, self-syncing, many-to-many association between users using a has_many :through association with a join model to keep track of a user's matches.
Let's start by creating our join model, Match, to belong to users via the user_id and matched_user_id columns:
# db/migrate/create_matches.rb
class CreateMatches < ActiveRecord::Migration
  def change
    create_table :matches do |t|
      t.references :user, index: true, foreign_key: true
      t.references :matched_user, index: true
      t.timestamps
    end

    # Only one row per ordered (user, matched user) pair.
    add_index :matches, [:user_id, :matched_user_id], unique: true
    # matched_user_id also points at the users table.
    add_foreign_key :matches, :users, column: :matched_user_id
  end
end

# app/models/match.rb
class Match < ActiveRecord::Base
  belongs_to :user
  belongs_to :matched_user, class_name: "User"
end
And then add our has_many and has_many :through associations to our User model:
# app/models/user.rb
class User < ActiveRecord::Base
  has_many :matches
  has_many :matched_users, through: :matches
end
This is pretty straightforward. Now if we have a user Alice and add Bob to her matched users collection, we will see that it contains Bob:
alice = User.find_by(email: 'alice@example.com')
bob = User.find_by(email: 'bob@example.com')
alice.matched_users << bob
alice.matched_users # => [bob]
However, if we look from Bob's point of view, we can't see that he is matched to Alice:
bob.matched_users # => []
But we want to make sure that any time Alice is matched with Bob, Bob is also matched with Alice using the same matched_users API. In order to do this, we'll add an after_create and an after_destroy callback to the Match model. Any time a match is added or removed, we'll create or destroy an inverse record, respectively:
# app/models/match.rb
class Match < ActiveRecord::Base
  belongs_to :user
  belongs_to :matched_user, class_name: "User"

  after_create :create_inverse, unless: :has_inverse?
  after_destroy :destroy_inverses, if: :has_inverse?

  # Create the mirrored record (matched user -> user).
  def create_inverse
    self.class.create(inverse_match_options)
  end

  # Destroy every mirrored record, in case more than one exists.
  def destroy_inverses
    inverses.destroy_all
  end

  def has_inverse?
    self.class.exists?(inverse_match_options)
  end

  def inverses
    self.class.where(inverse_match_options)
  end

  # The same pair of ids with the two columns swapped.
  def inverse_match_options
    { matched_user_id: user_id, user_id: matched_user_id }
  end
end
An inverse match is simply a match record where the user_id and matched_user_id are flipped, so that when we look up matches for Bob, we will be able to find matches with his id as the matched_users foreign key. In order to be thorough and conservative with our database records, we make sure we only create an inverse if one doesn't already exist, and we'll destroy all inverses that may have been created. Now, if we try adding Bob to Alice again, we'll see that they both have each other as matches:
alice.matched_users << bob
alice.matched_users # => [bob]
bob.matched_users # => [alice]
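The effect of the two callbacks can be sketched without a database. The following is a plain-Ruby stand-in, assuming a hypothetical `MatchTable` class that stores rows as `[user_id, matched_user_id]` pairs; it is not how ActiveRecord stores anything, it only mirrors the create-inverse and destroy-inverse behavior.

```ruby
require "set"

# Hypothetical in-memory stand-in for the matches table.
# Each row is a [user_id, matched_user_id] pair.
class MatchTable
  def initialize
    @rows = Set.new
  end

  # Mirrors after_create :create_inverse, unless: :has_inverse?
  def create(user_id, matched_user_id)
    @rows << [user_id, matched_user_id]
    @rows << [matched_user_id, user_id] # Set#<< ignores an existing inverse
    self
  end

  # Mirrors after_destroy :destroy_inverses
  def destroy(user_id, matched_user_id)
    @rows.delete([user_id, matched_user_id])
    @rows.delete([matched_user_id, user_id])
    self
  end

  # Ids matched to the given user, looked up by the first column only.
  def matched_user_ids(user_id)
    @rows.select { |uid, _| uid == user_id }.map { |_, mid| mid }.sort
  end
end

table = MatchTable.new
table.create(1, 2)          # alice (1) matches bob (2)
table.matched_user_ids(1)   # => [2]
table.matched_user_ids(2)   # => [1]
table.destroy(2, 1)         # removing from either side removes both rows
table.matched_user_ids(1)   # => []
```

Because both directions are stored as rows, a lookup only ever needs to filter on the first column, which is what lets the same matched_users API work from either side.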
Awesome, this is exactly what we want. But wait. Let's make sure that these stay in sync if we remove Bob from Alice's matched users:
alice.matched_users # => [bob]
alice.matched_users.destroy_all # => [bob]
alice.matched_users # => []
bob.matched_users # => [alice]
Even though we have an after_destroy callback set up, Alice is still in Bob's matched users. Here's why:
For has_many, destroy and destroy_all will always call the destroy method of the record(s) being removed so that callbacks are run. However delete and delete_all will either do the deletion according to the strategy specified by the :dependent option, or if no :dependent option is given, then it will follow the default strategy. The default strategy is to do nothing (leave the foreign keys with the parent ids set), except for has_many :through, where the default strategy is delete_all (delete the join records, without running their callbacks).
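The distinction between removing a record with and without callbacks can be illustrated with a toy class. This is a plain-Ruby sketch, not ActiveRecord: `ToyRecord` and its `log` are hypothetical names used only to show why a callback-free removal leaves the inverse record behind.

```ruby
# Toy record that, like an ActiveRecord model, runs its callback on
# destroy but skips it on delete.
class ToyRecord
  attr_reader :log

  def initialize
    @log = []
  end

  # Remove the record and run the after-destroy callback.
  def destroy
    @log << :removed
    @log << :after_destroy_callback
    self
  end

  # Remove the record only; no callbacks run.
  def delete
    @log << :removed
    self
  end
end

ToyRecord.new.destroy.log # => [:removed, :after_destroy_callback]
ToyRecord.new.delete.log  # => [:removed]
```

In our case the skipped callback is the one that destroys the inverse match, which is why Bob still sees Alice.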
So in order to make sure we maintain the bi-directional integrity of the association, we need to change the dependent strategy on the User#has_many association so that it actually calls destroy when we modify via association methods:
# app/models/user.rb
class User < ActiveRecord::Base
  has_many :matches
  has_many :matched_users, through: :matches,
                           dependent: :destroy
end
Keep in mind here that on a has_many :through association, when destroy or delete methods are called, it will always remove the link between the two models, not the models themselves. By adding dependent: :destroy, we are telling ActiveRecord that we want to make sure callbacks are run whenever we remove an item from the collection. Now if we try again, we should see what we expect:
alice.matched_users # => [bob]
alice.matched_users.destroy_all # => [bob]
alice.matched_users # => []
bob.matched_users # => []
With this setup, I can judiciously run my matching algorithm for a user only when it makes sense to do it (e.g. after they update their profile), and all users' matches will be automatically kept in sync without having to re-run the match algorithm for everyone. All user matches and unmatches will automatically be reciprocated when I make the change on a single user record. So now, instead of a controller that looks like this:
# app/controllers/matches_controller.rb
def index
  # takes over 1 second
  @matched_users = MatchMaker.matches_for(current_user)
                             .page(params[:page])
end
We can do something more like this:
# app/controllers/matches_controller.rb
before_action :resync_matches, only: :index

def index
  # several orders of magnitude faster
  @matched_users = current_user.matched_users
                               .page(params[:page])
end

private

def resync_matches
  # only resync if we have to
  if current_user.matches_outdated?
    new_matches = MatchMaker.matches_for(current_user)
    current_user.matched_users.replace(new_matches)
  end
end
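The resync relies on replace, which diffs the collection and only adds or removes the records that changed, so unchanged matches keep their rows and timestamps. A rough plain-Ruby sketch of that diff, assuming a hypothetical `match_changes` helper that works on user ids rather than records:

```ruby
require "set"

# Hypothetical helper: which ids must be added or removed to turn the
# current list of matched user ids into the freshly computed one.
def match_changes(current_ids, fresh_ids)
  current = current_ids.to_set
  fresh = fresh_ids.to_set
  {
    add: (fresh - current).to_a.sort,    # new matches to create
    remove: (current - fresh).to_a.sort  # stale matches to destroy
  }
end

match_changes([1, 2, 3], [2, 3, 4]) # => { add: [4], remove: [1] }
```

Matches 2 and 3 are left alone in this example, which matters here because we care about how long two users have been considered a match.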
This blog post was written in parallel with an example Rails project using TDD, so you can clone and experiment with the code yourself.
Happy match-making!