drizzle-team/drizzle-orm

[FEATURE]: Support Polymorphic Association

leonard-henriquez opened this issue ยท 43 comments

Describe what you want

I'm looking for a Typesafe ORM that support for polymorphic associations.

To give a concrete example:
I have a Comment model. I need this model to be associated with both Article and Photo models.
Instead of creating separate tables or associations for comments on articles and comments on photos, I would prefer a unified approach where a single Comment can belong to either an Article or a Photo.

Most ORMs (and query builders) except Prisma support polymorphic association.
Prisma is not taking interest in solving this issue: prisma/prisma#1644.
If you plan on implementing Prisma, that would be a killer feature that would be a reason in itself to use Drizzle.
A few teams have mentioned it explicitly in that issue.

Do you plan to support and document this feature in the near future ?

Is this a duplicate of #207?

@luxaritas yes it is ! However I didn't find it because the title is not accurate and the issue doesn't have a description.
@dankochetov Could you update the original issue (#207) so we can close this one ?

I worked around this by having nullable FKs in the Comment table for Article and Photo, say an article_id and a photo_id.
Of course you need to be mindful when you're doing your joins but it works pretty good.
Your Comments relation might look like this:

export const commentRelations = relations(comments, ({ one }) => ({
  user: one(users, {
    fields: [comments.user_id],
    references: [users.id]
  }),
  article: one(articles, {
    fields: [comments.article_id],
    references: [articles.id]
  }),
  photo: one(photos, {
    fields: [comments.photo_id],
    references: [photos.id]
  })
}));

I'm curious, how would you do polymorphic association in plain SQL?

The best approach I've seen which maintains the ability to have foreign keys is by adding a "commentable' table. That is, article and photo both have a foreign key (1:1 relation) to commentable, and comment has a foreign key (many:1) relation to commentable.

I see what you mean, Comment would have a not null FK. But in the approach I used, I never lost the ability to have foreign keys, even with on update and on delete. It's just that Comments would have several FKs columns, one for each of the other tables that needs comments.

Yep, that's true - though the more possible tables you add, the sparser the table becomes - not necessarily desirable (namely in terms of data modeling/efficiency/ensuring integrity given the constraint exactly one must be not-null)

Would it be possible to do something like Rails Polymorphic operations where you could have a type? https://guides.rubyonrails.org/association_basics.html#polymorphic-associations.

Would the configuration of the column type need to be on the one function call of the relation and just take multiple tables?

CanRau commented

Just now realized that Drizzle doesn't yet seem to have first class support for polymorphism, while trying to get my schema going.

I also have a commenting table which references to more than just 2 other tables so defining nullable fields for each would be really messy, also I have other tables like taggin (for tags) and likes which are polymophic.

The question I have for this feature request is what does it look like? I understand at the high level what is needed but I don't see what needs to happen in drizzle to support it?
I have something similar working in one of my projects and I mentioned before how I did it.
I also took a look at the way that was implemented in ruby on rails, and I spun up a quick db to test. See how I implemented the same table to hold comments for 3 different other tables:

// User section
export const users = mysqlTable("users", {
  id: int("id").autoincrement().primaryKey(),
  name: text("name").notNull(),
  createdAt: timestamp("created_at", { fsp: 3, mode: "string" }).default(
    sql`current_timestamp(3)`,
  ),
  commentType: varchar("comment_type", { length: 256 }).default(sql`'user'`),
});

export const usersRelations = relations(users, ({ many, one }) => ({
  comments: many(comments),
}));

// Country section
export const countries = mysqlTable(
  "countries",
  {
    id: int("id").autoincrement().primaryKey(),
    name: varchar("name", { length: 256 }),
    createdAt: datetime("created_at", { fsp: 3 }).default(
      sql`current_timestamp(3)`,
    ),
    commentType: varchar("comment_type", { length: 256 }).default(
      sql`'country'`,
    ),
  }
);

export const countriesRelations = relations(countries, ({ many }) => ({
  comments: many(comments),
}));

// City section
export const cities = mysqlTable("cities", {
  id: int("id").autoincrement().primaryKey(),
  name: varchar("name", { length: 256 }),
  createdAt: datetime("created_at", { fsp: 3 }).default(
    sql`current_timestamp(3)`,
  ),
  commentType: varchar("comment_type", { length: 256 }).default(sql`'city'`),
});

export const citiesRelations = relations(cities, ({ many }) => ({
  comments: many(comments),
}));

// The important part the comments. Notice how the commentableType is a enum, will have intelisense
export const comments = mysqlTable("comments", {
  id: int("id").autoincrement().primaryKey(),
  content: text("content").notNull(),
  commentableId: int("commentable_id").notNull(),
  commentableType: mysqlEnum("commentable_type", ["user", "city", "country"]),
});

export const commentRelations = relations(comments, ({ one }) => ({
  user: one(users, {
    fields: [comments.commentableId, comments.commentableType],
    references: [users.id, users.commentType],
  }),
  city: one(cities, {
    fields: [comments.commentableId, comments.commentableType],
    references: [cities.id, cities.commentType],
  }),
  country: one(countries, {
    fields: [comments.commentableId, comments.commentableType],
    references: [countries.id, countries.commentType],
  }),
}));

I tested insertion, selection and update, everything works as expected.
When inserting comments, your LSP will suggest the commentableType, and when selecting, you can use the commentableType to select the comments from whatever table you want. For example, the following query will return a user with it's comments:

const userWithComments = await db.query.users.findFirst({
  where: eq(users.id, 1),
  with: {
    comments: true,
  },
});

Both ways of doing it will have it's pros and cons, using this method, your comments table will be compact, but it won't have database level referential integrity. Doing it the other way, you table will be more sparse, but you keep referential integrity.

If somebody knows a way to have different foreign keys on the same column, referencing different tables, I would like to know.

My previous comment as well as my comments in #207 describe an alternative that maintains referential integrity without a sparse table, with the limitation of adding additional intermediate table(s)

I didn't test that, but I also don't see what limitation Drizzle has that prevents anybody from implementing it.

Right - I agree, I'm not sure if there's anything to do (aside from unions for some cases, which is coming)

The only other thing that maybe comes to mind is if the RQB was able to support the type field implicitly at runtime without the additional "static" column

I'm struggling to implement your solution. Because the commantable table does not hold a reference to the acticle (or user, or country, or whatever table is being commented), the onDelete would only work if the commentable table is deleted, and we want it all the way around.
True 1:1 tables share a common id, their primary key. You can't have it like that in this case because of the attempt on polymorphism.
Another issue is that you could end up with two different tables pointing to the same commentable row. There's no way to enforce uniqueness because commentable doesn't hold a FK from the commented tables.
Lastly, there is no straightforward way to select comments from one particular entity, say a user. Maybe:
SELECT content from comments where commentableId in (select commentableId from users where id = XXX)

Can you put together a quick example? I'll give another try tomorrow.

I agree that the design you proposed should work and maintain referential integrity @luxaritas. I also think that the polymorphic relationship from Rails can be nicer if you're willing to give up referential integrity.

@Angelelz it seems like your code in #1051 (comment) shows that Drizzle already supports polymorphic associations so maybe we can close this issue? I'm just curious why users for example need a commentType field?

I'm just curious why users for example need a commentType field?

It is not necessary for this to work, but it's a nice way to let drizzle know what comments belong to what table.
This allows Drizzle to return only the comments when you do:

const userComments = await db.query.comments.findMany({
  where: eq(comments.commentType, "user"),
})

This will not be database level reference, but a Drizzle reference.
The column in the users table allows you to define the relation with two references in Drizzle:

export const commentRelations = relations(comments, ({ one }) => ({
  user: one(users, {
    fields: [comments.commentableId, comments.commentableType],
    references: [users.id, users.commentType],
  }),
});

When Drizzle adds support for generated columns, this could be generated and you can just forget about it.

Is this because the cardinality of fields needs to match references?

Could I do this instead?

export const commentRelations = relations(comments, ({ one }) => ({
  user: one(users, {
    fields: [comments.commentableId, "user"],
    references: [users.id],
  }),
});

Is this because the cardinality of fields needs to match references?

Yes. And I tried passing a string but the API currently only accepts a column.

Ok, so it seems like most of the opportunity here is making that API more ergonomic and possibly defining some sort of type safety going from comment to commentable?

I do not believe a change to the API would be necessary, what would this allow?

export const commentRelations = relations(comments, ({ one }) => ({
  user: one(users, {
    fields: [comments.commentableId, "user"],
    references: [users.id],
  }),
});

You can just get rid of the "user", and it would work exactly the same way. You would still need to insert the commantableType: "user" | "country" | "city" as an enum every time you insert a new comment. It would be the only way to identify what type of comment it would be.

I assume you need the commentableType in the fields to make many work. Otherwise how would Drizzle know which comments to select when going from user to comments?

CanRau commented

amazing workaround @Angelelz, I'm possibly missing something here, though my polymorphic tables let's say commenting in this case already has a reference to commentableType & commentableId, so kind of the reverse of what you're showing, do I still need to also have users.commentType? Because this would be the same on all users it feel pretty redundant right?

Also just in case, I actually don't care about referential integrity as Vitess doesn't support foreign keys anyways, which is what PlanetScale uses under the hood

@CanRau I think that's the question chain from above. It seems like right now users.commentType is required but I have yet to test it myself. If it is required, I think there is an opportunity to change the API to make it not required.

It's definitely not required. I'm starting to think it's actually kinda redundant. My idea to use it was to try and actually have referential integrity, but in all the tests I've done I haven't been able to make it work. I'll report again if I have some more ideas. I am yet to try @luxaritas approach successfully.

How would it add referential integrity?

CanRau commented

Still wrapping my head around drizzle, how would I go about @Angelelz example without user.commentType? ๐Ÿ˜…

@pspeter3 you suggest this would require an API change though from @Angelelz comment after it sounds like it's already possible without API changes?

How would it add referential integrity?

It wouldn't. Like I said, I was investigating if I could find a way to do it.

how would I go about @Angelelz example without user.commentType?

No API changes are necessary. this is the implementation without user.commentType:

// Country section
export const countries = mysqlTable(
  "countries",
  {
    id: int("id").autoincrement().primaryKey(),
    name: varchar("name", { length: 256 }),
    createdAt: datetime("created_at", { fsp: 3 }).default(
      sql`current_timestamp(3)`,
    ),
  }
);

export const countriesRelations = relations(countries, ({ many }) => ({
  comments: many(comments),
}));

// City section
export const cities = mysqlTable(
  "cities",
  {
    id: int("id").autoincrement().primaryKey(),
    name: varchar("name", { length: 256 }),
    createdAt: datetime("created_at", { fsp: 3 }).default(
      sql`current_timestamp(3)`,
    ),
  },
);

export const citiesRelations = relations(cities, ({ many }) => ({
  comments: many(comments),
}));

// User section
export const users = mysqlTable(
  "users",
  {
    id: int("id").autoincrement().primaryKey(),
    name: text("name").notNull(),
    createdAt: timestamp("created_at", { fsp: 3, mode: "string" }).default(
      sql`current_timestamp(3)`,
    ),
  },
);

export const usersRelations = relations(users, ({ many, one }) => ({
  comments: many(comments),
}));

// Comments section
export const comments = mysqlTable(
  "comments",
  {
    id: int("id").autoincrement().primaryKey(),
    content: text("content").notNull(),
    commentableId: int("commentable_id").notNull(),
    commentType: mysqlEnum("comment_type", ["user", "city", "country"]),
  }
);

export const commentRelations = relations(comments, ({ one }) => ({
  user: one(users, {
    fields: [comments.commentableId],
    references: [users.id],
  }),
  city: one(cities, {
    fields: [comments.commentableId],
    references: [cities.id],
  }),
  country: one(countries, {
    fields: [comments.commentableId],
    references: [countries.id],
  }),
}));

The only downside of doing it without the commentType on the commentable tables is that you would now need an extra where condition for the commentType on the comment table:

const usersWithComments = await db.query.users.findFirst({
  where: eq(users.id, 1),
  with: {
    comments: {
      where: eq(comments.commentType, "user"),
    },
  },
});

If you don't include that where, you will fetch comments from cities and countries that have id of 1.
I think adding the commentType is a cleaner solution IMO.

CanRau commented

Uuuh I see, of course makes sense, having to do this on every query wouldn't be optimal though hooks could solve that #1426 ๐Ÿ˜ƒ

I had a nice discussion with @dankochetov and him and the team think the best way to do this would be with an extra where in the relation definition. The API would look like this:

export const usersRelations = relations(users, ({ many }) => ({
  comments: many(comments, {
    where: eq(comments.commentType, 'user')
  }),
}));

See this comment for context.

Oh interesting! That's the kind of API change I was thinking about. The other idea I had was:

export const commentRelations = relations(comments, ({ one }) => ({
  user: one(users, {
    fields: [comments.commentableType, comments.commentableId],
    references: ["users", users.id],
  }),
});

We discussed that, but the where would address polymorphic associations and also other issues at the same time. And it's a cleaner API IMO

Awesome! It seems clearer to me too. Thanks for discussing it with the team

I'd like to suggest an alternative way of modelling polymorphic relationship with Drizzle, which works great in Postgres :

Suppose you'd want a Like entity to belong to either a picture, a video or a song.

You would want to have the following check on the Table :

ALTER TABLE "Like"
  ADD CONSTRAINT check_like_polymorphism
  CHECK (num_nonnulls("pictureId", "videoId", "songId") = 1);

Then in the model use nullables ids (and indexes if you want) :

export const like = pgTable(
  'Like',
  {
    id: text('id').primaryKey().default(uuid).notNull(),
    createdAt: timestamp('createdAt', { precision: 3, mode: 'string' }).defaultNow().notNull(),
    // Polymorphic Relationship
    pictureId: text('pictureId').references(() => picture.id),
    videoId: text('videoId').references(() => video.id),
    songId: text('songId').references(() => song.id)
  },
  (table) => {
    type Column = keyof typeof table;
    const notNull = (column: Column) => sql`${table[column]} IS NOT NULL`;
    const nameIndex = <N extends Column>(column: N) => `Like_${column}_idx` as const;
    return {
      pictureIdx: index(nameIndex('pictureId')).on(table.pictureId).where(notNull('pictureId')),
      videoIdx: index(nameIndex('videoId')).on(table.videoId).where(notNull('videoId')),
      songIdx: index(nameIndex('songId')).on(table.songId).where(notNull('songId'))
    };
  }
);

This would enforce that only one of these 3 columns have a value while the other 2 must be null.
Unfortunately this isn't supported by drizzle-kit, so you need to manually edit the migrations to add the checks and the where in the index:

Overall in my opinion it's a much nicer way to deal with polymorphic relationship than the way most ORMs handle them.
I'm not sure if this works outside of Postgres though.

kenn commented

I'm currently using this with drizzle to model polymorphic relationship, which works great in Postgres :

ALTER TABLE "Like"
  ADD CONSTRAINT check_like_polymorphism
  CHECK (num_nonnulls("pictureId", "videoId", "songId") = 1);

Many DBs such as Postgres or MySQL indexes NULL, so it takes more storage and memory pressure compared to the fixed two elements (type, id) tuple without NULLs. It gets worse by O(N) as you add more reference types. I would stick with the best practice of (type, id) tuple.

I'm currently using this with drizzle to model polymorphic relationship, which works great in Postgres :

ALTER TABLE "Like"
  ADD CONSTRAINT check_like_polymorphism
  CHECK (num_nonnulls("pictureId", "videoId", "songId") = 1);

Many DBs such as Postgres or MySQL indexes NULL, so it takes more storage and memory pressure compared to the fixed two elements (type, id) tuple without NULLs. It gets worse by O(N) as you add more reference types. I would stick with the best practice of (type, id) tuple.

It's a tradeoff. Do you want referential integrity or do you want to have a few nullable columns in your table ?
I would argue that having a solution that works outside of the ORM is worth it, but by all means chose what makes the most sense for your application.

AFAIK having a null FK doesn't affect execution time.

Edit : It's actually possible to add referential integrity to the (type, id) solution. There's some more potential solutions in this article https://www.cybertec-postgresql.com/en/conditional-foreign-keys-polymorphism-in-sql/

Not an advertisement, but Sequelize's document support three types of polymorphic association. Maybe Drizzle Team can take a look for reference? I believe Single-model, single-foreign-key polymorphic associations is the more common way to achieve it using a type field and an id field

I had a nice discussion with @dankochetov and him and the team think the best way to do this would be with an extra where in the relation definition. The API would look like this:

export const usersRelations = relations(users, ({ many }) => ({
  comments: many(comments, {
    where: eq(comments.commentType, 'user')
  }),
}));

See this comment for context.

Longtime Rails nerd here evaluating this library. This is the most straightforward way to do it.

Coming from Rails also, and I loved reading this conversation. If you build it we will come, and... here we are, so I guess you built it. ๐Ÿ‘ ๐Ÿ’ฏ
I need to dig into the relations API.

I'll just add that the documentation currently has zero hits for "polymorph*", and this is the term people are going to be looking for, and this is a crutch feature that many potential users, especially coming from Rails, will hope to see and keep looking elsewhere if they don't see it.

Also, why is this issue still open? Is there more polymorphin' power rangers work coming?

I had a nice discussion with @dankochetov and him and the team think the best way to do this would be with an extra where in the relation definition. The API would look like this:

export const usersRelations = relations(users, ({ many }) => ({
  comments: many(comments, {
    where: eq(comments.commentType, 'user')
  }),
}));

See this comment for context.

Coming from objectionjs and kysely... this is very similar to how those tackled the relationships... they allowed a 'where' to be built into the relationship. Was pretty flexible and would be great to see.

Curious if anyone has a code example of achieving polymorphic models with drizzle or is the above the current best convention?

I am confused because this issue (#207) was closed as completed and it is implied in that issue that "UNION support" is the same as polymorphic association. Do unions somehow enable polymorphic relations?

Any help or links/resources appreciated, but reading through this and a couple other related issues and haven't been able to find a concrete example.

Curious if anyone has a code example of achieving polymorphic models with drizzle or is the above the current best convention?

I am confused because this issue (#207) was closed as completed and it is implied in that issue that "UNION support" is the same as polymorphic association. Do unions somehow enable polymorphic relations?

Any help or links/resources appreciated, but reading through this and a couple other related issues and haven't been able to find a concrete example.

I think the answer to the question of whether that issue has anything to do with polymorphic association is no.

I'm similarly confused because the shade thrown at prisma implies some sort of significant victory, when it's clear the prisma issue is discussing unions as a subproblem of enabling polymorphism.

Polymorphism is a much more popular use case than unions. The set operation use cases mentioned in the drizzle docs are extremely contrived. They're all basically "imagine you had very similar data in two tables for some reason".

I don't mean to disparage anyones hard work. Obviously these set operations are a great feature, but there's a misunderstanding of what most people are asking for.

From what I understand, this will be included in the Relational API v2, which is currently in the early stages of active development.
#2316 (comment)