beam-community/ex_machina

Constraint error when attempting to insert list with a build association (with sequence)

josephan opened this issue · 9 comments

Consider a chat application with the following schema (the users table has a unique index on the email column):

[Image: schema diagram]

When I want to insert a list of chat_memberships like so:

    chat_room = insert(:chat_room)

    insert_list(
      20,
      :chat_membership,
      chat_room: chat_room,
      user: build(:user)
    )

I get the following error, even though the user factory has a sequence for the email field:

     ** (Ecto.ConstraintError) constraint error when attempting to insert struct:

         * users_email_index (unique_constraint)

     If you would like to stop this constraint violation from raising an
     exception and instead add it as an error to your changeset, please
     call `unique_constraint/3` on your changeset with the constraint
     `:name` as an option.

     The changeset has not defined any constraint.

However if I setup my data like this instead, it works:

    chat_room = insert(:chat_room)

    users = build_list(20, :user)

    Enum.each(users, fn user ->
      insert(:chat_membership, user: user, chat_room: chat_room)
    end)

Is this the intended behaviour?

I've created a bare minimum phoenix app to reproduce this issue: https://github.com/josephan/ex_machina_sample

Factory file:
https://github.com/josephan/ex_machina_sample/blob/master/test/support/factory.ex

The test file that demonstrates the issue: https://github.com/josephan/ex_machina_sample/blob/master/test/ex_machina_sample_web/sample_test.exs

In this app there is a users table and a chat_rooms table.
These tables have a many to many relationship with chat_memberships table.

Hi @josephan, thanks so much for the detailed description along with an app to reproduce the issue! I can't begin to tell you how incredibly helpful that is. I haven't had the time to take a look at the app yet, but I think I know what might be going on. If that's not it, I'll take a look at it for sure when I have a few more minutes.

Looking at the description you posted, I think the issue comes from the fact that factories are built eagerly. So when you call build(:user) we're building the %User{} struct immediately. In other words, in the first scenario, this is what is happening:

# this
    insert_list(
      20,
      :chat_membership,
      chat_room: chat_room,
      user: build(:user)
    )

# is equivalent to this
    insert_list(
      20,
      :chat_membership,
      chat_room: chat_room,
      user: %User{name: "some name", email: "emailcausingissues@example.com"}
    )

See how the user struct is built only once, so you're trying to use the same email for all 20 chat memberships?

In the second case, you're building 20 user structs with 20 different emails (assuming you're using sequences or something like that):

 # this
    users = build_list(20, :user)

    Enum.each(users, fn user ->
      insert(:chat_membership, user: user, chat_room: chat_room)
    end)

# is equivalent to this
  users = [%User{name: "john", email: "1@example.com"}, %User{name: "john", email: "2@example.com"}, ..., %User{name: "john", email: "20@example.com"}]

  Enum.each(users, fn user ->
    insert(:chat_membership, user: user, chat_room: chat_room)
  end)

In the second case, you're building 20 different users, so creating the chat memberships doesn't cause a problem. Does that make sense?
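To make the eager evaluation concrete, here is a minimal sketch in plain Elixir that does not use ExMachina at all. The `EagerDemo` module, its Agent-based counter, and the `build_user/0` and `build_memberships/2` helpers are hypothetical stand-ins for a sequence, `build(:user)`, and `insert_list/3`. They only illustrate that an argument expression is evaluated once, before the function it is passed to runs.

```elixir
defmodule EagerDemo do
  # Hypothetical stand-in for a factory sequence: an Agent holding a counter.
  def start, do: Agent.start_link(fn -> 0 end, name: __MODULE__)

  # Returns a new email on every call, like a sequence would.
  def next_email do
    n = Agent.get_and_update(__MODULE__, fn n -> {n + 1, n + 1} end)
    "user#{n}@example.com"
  end

  # Hypothetical stand-in for build(:user).
  def build_user, do: %{email: next_email()}

  # Hypothetical stand-in for insert_list/3: the same attrs go into every record.
  def build_memberships(count, attrs) do
    for _ <- 1..count, do: Map.new(attrs)
  end
end

{:ok, _pid} = EagerDemo.start()

# build_user() runs once here, before build_memberships/2 is called,
# so all three records share the same user and the same email.
shared = EagerDemo.build_memberships(3, user: EagerDemo.build_user())
shared_emails = shared |> Enum.map(& &1.user.email) |> Enum.uniq()

# Calling build_user() inside the loop runs it once per record instead.
distinct = for _ <- 1..3, do: Map.new(user: EagerDemo.build_user())
distinct_emails = distinct |> Enum.map(& &1.user.email) |> Enum.uniq()
```

In this sketch `shared_emails` contains a single email while `distinct_emails` contains three, which mirrors the difference between the two scenarios above.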

@germsvel: while you've articulated what the cause of this issue is, can you suggest a solution beyond passing every association into a factory as a pre-created object? This behaviour is not what I expected when I started using ExMachina for factories.

Are there any updates or suggested fixes on this issue?

Currently there's no other way for this to work. build/2 is just an Elixir function that is evaluated when called. There's very little magic happening. That's why these are the same:

# this
insert_list(
  20,
  :chat_membership,
  user: build(:user)
)

# is equivalent to this
user = build(:user)
insert_list(20, :chat_membership, user: user)

The build(:user) will always get evaluated before insert_list/3 gets called.

That being said, I've been considering adding an alternative like build_lazy/2 (still trying to find a good name) that would either capture the building in an anonymous function or create a struct that represents the factory that is to be built. I did a spike in this branch to see how it would work. Right now, that branch would let you do something like this:

insert_list(20, :chat_membership, chat_room: chat_room, user: build_lazy(:user))

The user struct would get built at the time when we're creating each of the 20 chat memberships -- so it would generate a different user for each chat membership.
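As a rough illustration of the "struct that represents the factory that is to be built" idea, here is a sketch in plain Elixir. It is not the code from the spike branch: the `LazySketch` module, the `Deferred` struct, `resolve/2`, and the `build` function passed to it are all hypothetical names chosen for this example, and only the name `build_lazy` comes from the comment above.

```elixir
defmodule LazySketch do
  # Hypothetical marker struct meaning "build this factory later".
  defmodule Deferred do
    defstruct [:factory_name, attrs: []]
  end

  # Records what should be built without building it yet.
  def build_lazy(factory_name, attrs \\ []) do
    %Deferred{factory_name: factory_name, attrs: attrs}
  end

  # Builds the deferred factory at resolution time using the given build function.
  def resolve(%Deferred{factory_name: name, attrs: attrs}, build_fun) do
    build_fun.(name, attrs)
  end

  # Any other value was already built eagerly and is passed through unchanged.
  def resolve(value, _build_fun), do: value
end

# Hypothetical stand-in for build/2 with a counter acting as the sequence.
counter = :counters.new(1, [])

build = fn :user, _attrs ->
  :counters.add(counter, 1, 1)
  %{email: "user#{:counters.get(counter, 1)}@example.com"}
end

# Each resolution builds a fresh user, so the emails differ.
a = LazySketch.resolve(LazySketch.build_lazy(:user), build)
b = LazySketch.resolve(LazySketch.build_lazy(:user), build)
```

A struct like this can be inspected or pattern matched before it is resolved, whereas an anonymous function is opaque until it is called. That is one possible trade-off between the two options mentioned above.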

What do you all think?

@sgerrand @ckoch-cars , I just opened #402. I'd love to hear your thoughts on it. I think it solves the problem we run into here.

Thanks @germsvel! I'll review it shortly.

Opened a PR for this work. Let me know what you think -> #406

I'll go ahead and close this issue since I believe it should be resolved by #408.

We should now be able to fix this by passing a function to delay the evaluation of user:

# instead of doing this
insert_list(
  20,
  :chat_membership,
  user: build(:user)
)

# we can now do this
insert_list(
  20,
  :chat_membership,
  user: fn -> build(:user) end
)
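The following sketch shows one way a list builder could resolve a function attribute once per record. It is written in plain Elixir and is not ExMachina's implementation: `LazyListSketch`, `resolve_attrs/1`, `build_list/2`, and the `build_user` function are hypothetical names used only for illustration.

```elixir
defmodule LazyListSketch do
  # Hypothetical resolver: call zero-arity functions, pass other values through.
  def resolve_attrs(attrs) do
    Map.new(attrs, fn
      {key, fun} when is_function(fun, 0) -> {key, fun.()}
      {key, value} -> {key, value}
    end)
  end

  # Hypothetical stand-in for insert_list/3: attrs are resolved once per record.
  def build_list(count, attrs) do
    for _ <- 1..count, do: resolve_attrs(attrs)
  end
end

# A counter acting as the sequence for the email field.
counter = :counters.new(1, [])

# Hypothetical stand-in for fn -> build(:user) end.
build_user = fn ->
  :counters.add(counter, 1, 1)
  %{email: "user#{:counters.get(counter, 1)}@example.com"}
end

# The function is passed unevaluated and called once for each record.
memberships = LazyListSketch.build_list(3, chat_room: :lobby, user: build_user)
emails = Enum.map(memberships, & &1.user.email)
```

Because the function is only called inside `resolve_attrs/1`, every record in this sketch gets its own user, while the plain `chat_room` value is shared across all of them.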

I think I've hit an edge case where the self-referencing aspect of this lazy evaluation doesn't work as expected for a full schema, e.g.

def post_factory do
  %Post{
    author: build(:user),
    contributors: fn post -> [post.author] end
  }
end

This seems to try to insert the user twice.