- Use SQLAlchemy to join tables with one-to-one, one-to-many, and many-to-many relationships.
In the previous lesson, we saw how to create a one-to-many association
between two models using SQLAlchemy by following certain naming conventions,
use of the relationship()
and backref()
methods, and using the right
foreign key on our tables when generating the migrations.
In the SQL section, we learned about one other kind of relationship: the many-to-many, also known as the has many through, relationship. For instance, in a domain where a cat has many owners and an owner can also have many cats, we needed to create another table to join between those two tables:
In this lesson, we'll learn how to create a many-to-many relationship in SQLAlchemy. We'll continue working on our games and reviews domain, but this time we'll add a third model into the mix: a users model. We'll be setting up these relationships:
- A game has many reviews
- A game has many users, through reviews
- A review belongs to a game
- A review belongs to a user
- A user has many reviews
- A user has many games, through reviews
Once we're done setting up the database tables, here's what the ERD will look like:
To get started, run pipenv install && pipenv shell
, then follow along with
the code.
Right now, we've got code for the Game
model (and the games
table), along
with the code for the Review
model (and the reviews
table) from the previous
lesson.
To start, let's add the code we'll need for the User
model as well. Let's
create the users
table with a name
column and timestamps:
# models.py
class User(Base):
__tablename__ = 'users'
id = Column(Integer(), primary_key=True)
name = Column(String())
created_at = Column(DateTime(), server_default=func.now())
updated_at = Column(DateTime(), onupdate=func.now())
# don't forget your __repr__()!
def __repr__(self):
return f'User(id={self.id}, ' + \
f'name={self.name})'
Note that for our models' timestamps, we are using some new arguments and values:
server_default
tells the database schema to set a value from the database itself. Since the database is kept in one central location, assigning it the work of creating default values means that we don't have to worry about the quality of our developers' or users' computers.onupdate
means exactly what it says: when the record is updated, the column value is set.- We saw
func
briefly in the previous module; it allows us to use SQL operations instead of their Python counterparts. This benefits us for the same reasons asserver_default
.func.now()
is equivalent to the current time.
We'll also need to modify the reviews
table and add a foreign key to refer to
our users
table. Remember, each review now belongs to a specific user. Any
time we create a belongs to relationship, we need a foreign key to establish
this relationship:
# models.py
class User(Base):
# tablename, columns
reviews = relationship('Review', backref=backref('user'))
# __repr__()
Let's also edit the Review
model to add our new foreign key:
# models.py
class Review(Base):
# tablename, columns
user_id = Column(Integer(), ForeignKey('users.id'))
# __repr__()
Now run alembic revision --autogenerate -m'Add User model'
from the
many-to-many
directory to make our migration. If all goes well, run
alembic upgrade head
to push your migrations to the database.
$ alembic upgrade head
# => INFO [alembic.runtime.migration] Context impl SQLiteImpl.
# => INFO [alembic.runtime.migration] Will assume non-transactional DDL.
# => INFO [alembic.runtime.migration] Running upgrade -> 9e396fc70825, Add User model
Run the first seed file as well to populate the games
and reviews
tables:
$ python seed.py
There are several ways to approach a many-to-many relationship in SQLAlchemy.
All require some sort of intermediary between the two models. We could do this
using the Review
model, but this isn't the best choice. What if we wanted to
add functionality in the future where users could log their games without adding
reviews? In any case, it's unlikely that reviews will be the only things tying
users and games to one another.
Many-to-many relationships in SQLAlchemy use intermediaries called association
tables. These are tables that exist only to join two related tables together.
This might sound like a waste at first, but keeping the foreign keys,
relationships, and backref
s confined to this one table allows us to freely
make changes to our related tables later on without worrying too much about how
they might affect each other.
There are two approaches to building these associations: association objects,
which are most similar to the models we've built so far, and the more common
approach, Table
objects.
An association object is really just another model, so let's add one into
models.py
. The convention is to call this model a combination of the related
models' names, so let's call ours GameUser
.
# models.py
class GameUser(Base):
__tablename__ = "game_users"
game_id = Column(ForeignKey('games.id'), primary_key=True)
user_id = Column(ForeignKey('users.id'), primary_key=True)
game = relationship('Game', backref=backref('users'))
user = relationship('User', backref=backref('games'))
def __repr__(self):
return f'GameUser(game_id={self.game_id}, ' + \
f'user_id={self.user_id})'
An association object should typically use a combination of the related models' primary keys as its own primary key. This allows us to keep the field unique while also making it clear which two records its records point to.
Next, we use the relationship()
method to connect to both the Game
and
User
models. Note that we refer to the other related table in our backref()
instead of game_users
. We typically choose to hide association tables while
our our application runs; this reference details the two endpoints of the
association rather than the middle.
Finally- don't forget your __repr__
!
Table
objects are instances of the sqlalchemy.Table
class. They function
more or less the same as data models, with the exception of being a little
more compact. This syntax visually de-emphasizes association tables in your
models and is the preferred approach in SQLAlchemy.
Let's build the same association table as above with our new syntax:
# models.py
game_user = Table(
'game_users',
Base.metadata,
Column('game_id', ForeignKey('games.id'), primary_key=True),
Column('user_id', ForeignKey('users.id'), primary_key=True),
extend_existing=True,
)
class Game(Base):
# tablename, columns
users = relationship('User', secondary=game_user, back_populates='games')
# __repr__()
class User(Base):
# tablenames, columns
games = relationship('Game', secondary=game_user, back_populates='users')
# __repr__()
Because we are creating an object that is being used in subsequent code,
association tables made from the Table
class must be defined above other data
models. Otherwise, they wouldn't exist when they were referenced!
A few more notes on this approach:
- The
Game
andUser
models each require a relationship with the other. - The
secondary
argument refers to the intermediary table in a many-to-many relationship. - The
back_populates
operates similiarly tobackref
, with the exception that it must be used on both sides of a relationship. Because many-to-many relationships are symmetrical, use ofback_populates
in both models is the best way to leave readable code behind for other developers.
Run alembic revision --autogenerate -m'Add game_user Association Table'
, then
alembic upgrade head
. You can use the script in app/seed_2.py
to generate new
data and interact with your database through the Python shell. To create
relationships between Game
records and User
records, run the second seed
file with python seed_2.py
.
This will add a Game
record to a User
record's games
if the user has
logged a review for the game. When the change is committed, SQLAlchemy also
builds the relationship in reverse, adding the User
record to the Game
record's users
!
We all know that users can have games without reviewing them. That being said,
our application might not allow users to claim ownership of a game without
posting a review! If this is the case, we can consider skipping a join table
entirely and use Review
to join User
and Game
. This syntax is a bit more
complicated, but you might find it useful in certain situations (perhaps like
the Phase 3 Code Challenge?)
An association object functioning as a traditional data model looks like a combination of the two, with some key differences:
NOTE: Remember to delete your
Table
object before trying this out!
# models.py
# intended as example only.
# downgrade to remove game_users before trying this out.
class Game(Base):
# tablename, columns
users = relationship(
'User',
secondary='reviews',
viewonly=True,
backref='games')
# __repr__()
class User(Base):
# no need for a relationship here!
# back_populates() is one-way, but backref() takes care of both in
# the Game model.
class Review(Base):
__tablename__ = 'reviews'
id = Column(Integer(), primary_key=True)
score = Column(Integer())
comment = Column(String())
created_at = Column(DateTime(), server_default=func.now())
updated_at = Column(DateTime(), onupdate=func.now())
game_id = Column(Integer(), ForeignKey(Game.id))
game = relationship('Game', backref=backref('reviews'))
user_id = Column(Integer(), ForeignKey(User.id))
user = relationship(
'User', backref=backref('reviews')
)
def __repr__(self):
return f'Review(id={self.id}, ' + \
f'score={self.score}, ' + \
f'game_id={self.game_id})'
In this instance, we used backref()
in place of back_populates
. It is
often more clear to use back_populates
because it explicitly declares a
relationship on both sides, but backref()
populates both sides of the
relationship and saves us a bit of code.
The Review
model has gotten rather big here, but it's doing more than enough
to justify it:
game_id
creates a relationship between reviews and games.game
creates a game object that belongs to the review.user_id
creates a relationship between reviews and users.user
creates a user object that belongs to the review.
Because of those object relationships, the relationship()
call in the Game
model is able to skip over the Review
model and directly link games and users.
The power of SQLAlchemy all boils down to understanding database relationships and making use of the correct classes and methods. By leveraging "convention over configuration", we're able to quickly set up complex associations between multiple models with just a few lines of code.
The one-to-many and many-to-many relationships are the most common when working with relational databases. You can apply the same concepts and code we used in this lesson to any number of different domains, for example:
Driver -< Ride >- Passenger
Doctor -< Appointment >- Patient
Actor -< Character >- Movie
The code required to set up these relationships would look very similar to the code we wrote in this lesson.
By understanding the conventions SQLAlchemy expects you to follow, and how the underlying database relationships work, you have the ability to model all kinds of complex, real-world concepts in your code!
# many_to_many/models.py
from sqlalchemy import create_engine, func
from sqlalchemy import ForeignKey, Table, Column, Integer, String, DateTime
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base
engine = create_engine('sqlite:///many_to_many.db')
Base = declarative_base()
game_user = Table(
'game_users',
Base.metadata,
Column('game_id', ForeignKey('games.id'), primary_key=True),
Column('user_id', ForeignKey('users.id'), primary_key=True),
extend_existing=True,
)
class Game(Base):
__tablename__ = 'games'
id = Column(Integer(), primary_key=True)
title = Column(String())
genre = Column(String())
platform = Column(String())
price = Column(Integer())
created_at = Column(DateTime(), server_default=func.now())
updated_at = Column(DateTime(), onupdate=func.now())
users = relationship('User', secondary=game_user, back_populates='games')
reviews = relationship('Review', backref=backref('game'))
def __repr__(self):
return f'Game(id={self.id}, ' + \
f'title={self.title}, ' + \
f'platform={self.platform})'
class User(Base):
__tablename__ = 'users'
id = Column(Integer(), primary_key=True)
name = Column(String())
created_at = Column(DateTime(), server_default=func.now())
updated_at = Column(DateTime(), onupdate=func.now())
games = relationship('Game', secondary=game_user, back_populates='users')
reviews = relationship('Review', backref=backref('user'))
def __repr__(self):
return f'User(id={self.id}, ' + \
f'name={self.name})'
class Review(Base):
__tablename__ = 'reviews'
id = Column(Integer(), primary_key=True)
score = Column(Integer())
comment = Column(String())
created_at = Column(DateTime(), server_default=func.now())
updated_at = Column(DateTime(), onupdate=func.now())
game_id = Column(Integer(), ForeignKey('games.id'))
user_id = Column(Integer(), ForeignKey('users.id'))
def __repr__(self):
return f'Review(id={self.id}, ' + \
f'score={self.score}, ' + \
f'game_id={self.game_id})'