crashtech/torque-postgresql

belongs_to_many: don't allow to inlined

Closed this issue · 21 comments

Hi!

Working with the latest version of the gem and rails 6.1.1, I have noticed an issue:

# frozen_string_literal: true
require 'torque-postgresql'

require "active_record"
require "minitest/autorun"
require "logger"

ActiveRecord::Base.establish_connection(
  adapter:  "postgresql",
  database: "test",
  encoding: "unicode",
  host:     "localhost",
  port:     "5432",
  password: "12345",
  username: "test")

ActiveRecord::Schema.define do
  drop_table "employees"
  drop_table "projects"

  create_table "employees" do |t|
    t.string "name"
    t.timestamps
  end

  create_table "projects" do |t|
    t.bigint "employees_ids", array: true
    t.string "title"
    t.timestamps
  end
end

class Employee < ActiveRecord::Base
end

class Project < ActiveRecord::Base
  belongs_to_many :employees, foreign_key: :employees_ids
end


class BugTest < Minitest::Test
  def test_create
    # create project with employee
    employee = Employee.create!
    project = Project.create!(employees: [employee])

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count

    project.reload

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count
  end
end

The employees_ids in this case aren't persisted to the DB, this the query that is running:

INSERT INTO "projects" ("created_at", "updated_at") VALUES ($1, $2) RETURNING "id"

Looks like that concat_recordsis called - https://github.com/crashtech/torque-postgresql/blob/master/lib/torque/postgresql/associations/belongs_to_many_association.rb#L94
result is an array of employees - which is good, but ids_reader returns an empty array.

I found some more issues here,

class BugTest < Minitest::Test
  def test_create_with_objects
    # create project with employee
    employee = Employee.create!

    before_create_ids = []
    before_create_objects = []
    Project.before_create do
      before_create_ids = self.employees_ids.dup
      before_create_objects = self.employees.to_a
    end

    project = Project.create!(employees: [employee])

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count

    assert_equal [employee.id], before_create_ids
    assert_equal 1, before_create_objects.count

    project.reload

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count
  end

  def test_create_with_ids
    # create project with employee
    employee = Employee.create!

    before_create_ids = []
    before_create_objects = []
    Project.before_create do
      before_create_ids = self.employees_ids.dup
      before_create_objects = self.employees.to_a
    end

    project = Project.create!(employees_ids: [employee.id])

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count

    assert_equal [employee.id], before_create_ids
    assert_equal 1, before_create_objects.count

    project.reload

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count
  end
end

In before_create in some cases the ids are present, other cases they are null, and other cases objects self.employees is empty while the ids have content

Thanks for finding these issues. I'm working on improving the belongs_to_many association manipulation.

Hi @yosiat, can you give it a try with the version on master? I rethought the association and rewrite it. Mixing has_many and belongs_to is quite complicated, I was also attempting not to perform multiple updates within operations like concat and replace. The master branch now contains the most refined version. Let me know if you find any other issues so I can add them to the same release.

@crashtech I am trying to run this test - #56 (comment) and it fails for me in master.


F

Failure:
BugTest#test_create_with_ids [/Users/yosi/code/torque-postgresql/torque.rb:85]:
Expected: 1
  Actual: 0


rails test Users/yosi/code/torque-postgresql/torque.rb:68

E

Error:
BugTest#test_create_with_objects:
ActiveModel::MissingAttributeError: can't write unknown attribute `employees_ids`
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activemodel-6.1.1/lib/active_model/attribute.rb:207:in `with_value_from_database'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activemodel-6.1.1/lib/active_model/attribute_set.rb:51:in `write_from_user'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/attribute_methods/write.rb:40:in `_write_attribute'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:25:in `public_send'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:25:in `ids_writer'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:62:in `block in build_changes'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:62:in `tap'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:62:in `build_changes'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:190:in `concat_records'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/associations/collection_association.rb:118:in `concat'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/associations/collection_association.rb:415:in `replace_records'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:186:in `block in replace_records'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:62:in `build_changes'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/belongs_to_many_association.rb:186:in `replace_records'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/associations/collection_association.rb:249:in `replace'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/associations/collection_association.rb:41:in `writer'
    /Users/yosi/code/torque-postgresql/lib/torque/postgresql/associations/builder/belongs_to_many.rb:37:in `employees='
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activemodel-6.1.1/lib/active_model/attribute_assignment.rb:49:in `public_send'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activemodel-6.1.1/lib/active_model/attribute_assignment.rb:49:in `_assign_attribute'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/attribute_assignment.rb:21:in `block in _assign_attributes'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/attribute_assignment.rb:13:in `each'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/attribute_assignment.rb:13:in `_assign_attributes'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activemodel-6.1.1/lib/active_model/attribute_assignment.rb:34:in `assign_attributes'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/core.rb:478:in `initialize'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/inheritance.rb:72:in `new'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/inheritance.rb:72:in `new'
    /Users/yosi/.rvm/gems/ruby-2.7.2/gems/activerecord-6.1.1/lib/active_record/persistence.rb:54:in `create!'
    /Users/yosi/code/torque-postgresql/torque.rb:54:in `test_create_with_objects'

This failure:

ActiveModel::MissingAttributeError: can't write unknown attribute `employees_ids`

Happens because we to _write_attribute a symbol and not a string, there is write_attribute which handles this conversion (from Symbol to String)

Update:
When I change the foreign_key to be string, the tests above works.

But for some reason, in my application when I docreate(:project, employees: [employee]), the associated array is null.
I debugged both cases:

In both places, I am using rails 6.1.1 and torque master (43ab8f9).

Update 2:
Managed to reproduce the above issue, I added relationship to Project which is simple has_many, and now the employees aren't persisted.

ActiveRecord::Schema.define do
  drop_table "employees", if_exists: true
  drop_table "projects", if_exists: true
  drop_table "departments", if_exists: true

  create_table "employees" do |t|
    t.string "name"
    t.timestamps
  end

  create_table "projects" do |t|
    t.bigint "employees_ids", array: true
    t.string "title"
    t.timestamps
  end

  create_table "departments" do |t|
    t.string "name"
    t.integer "project_id"
    t.timestamps
  end

end

class Department < ActiveRecord::Base
  after_initialize :do_something
  belongs_to :project

  def do_something
    pp self.project.employees
  end
end

class Employee < ActiveRecord::Base
end

class Project < ActiveRecord::Base
  belongs_to_many :employees, foreign_key: "employees_ids"
  has_many :departments
end

class BugTest < Minitest::Test
  def test_simple
    employee = Employee.create!
    department = Department.create!

    before_create_ids = []
    before_create_objects = []
    Project.before_save do
      puts "> called"
      before_create_ids = self.employees_ids.dup
      before_create_objects = self.employees.dup
    end

    project = Project.create!(employees: [employee], departments: [department])

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count

    assert_equal [employee.id], before_create_ids
    assert_equal 1, before_create_objects.count

    project.reload

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count
  end

  def test_new
    employee = Employee.create!

    project = Project.new(employees: [employee])
    project.departments << Department.new(project: project)
    project.save

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count

    assert_equal [employee.id], before_create_ids
    assert_equal 1, before_create_objects.count

    project.reload

    assert_equal 1, project.employees.count
    assert_equal 1, project.employees_ids.count
  end
end

Okay, I see how having another association causes the error. I added another test and it should be fixed now. Give it a try with master 13207fd. Sorry for the back and forth 😞

Trying with the latest master, and the tests above pass but the tests in my project don't pass for some reason.
I have an after_initialize in my department model when I comment it out - the employees are persisted to DB, when I have the after_initialize which access project.employees the employees aren't persisted.

I am trying to reproduce this with those tests, but it's hard some else is causing an issue there, I'll investigate it more and update.

The back and forth is totally ok! I really appreciate your help and I feel we have some progress here :)

Okay! Let me know about your findings or if there is any other way I can help. I'll wait for your feedback to push the new version.

Doing some debug session and I am finding some interesting things, I added logs in torque & rails.

What I find, is that in the Department model, I have after_initalize which does project.employees.count, in my app this cause the employees to not be persisted.

Test script logs:


[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=false @stale_state=, stale_state=
[torque] find_target? loaded?=false, foreign_key_present?=false, klass=Employee
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=true @stale_state=, stale_state=
[torque] find_target? loaded?=true, foreign_key_present?=false, klass=Employee
[torque] ids_writer ids=, command=write_attribute
[torque] ids_writer ids=, command=write_attribute

[Department] after_initialize called
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=true @stale_state=, stale_state=
[Department] after_initialize 0

[toreque] insert_record record_id=1
[torque] ids_writer ids=[1], command=write_attribute
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=true @stale_state=, stale_state=[1]
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=true @stale_state=, stale_state=[1]
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] reader stagle_target?=true, doing reload
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=false @stale_state=, stale_state=[1]
[torque] find_target? loaded?=false, foreign_key_present?=true, klass=Employee

In my app:

[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=false @stale_state=, stale_state=[]
[torque] find_target? loaded?=false, foreign_key_present?=false, klass=Employee
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=true @stale_state=[], stale_state=[]
[torque] find_target? loaded?=true, foreign_key_present?=false, klass=Employee
[torque] ids_writer ids=[], command=write_attribute
[torque] ids_writer ids=, command=write_attribute

[Department] after_initialize new_record?=true
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=true @stale_state=[], stale_state=
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=true @stale_state=[], stale_state=
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] reader stagle_target?=true, doing reload
[Torque::PostgreSQL::Associations::BelongsToManyAssociation] stale_target? inversed=false, loaded?=false @stale_state=, stale_state=
[torque] find_target? loaded?=false, foreign_key_present?=false, klass=Employee
[Department] after_initialize employees=0

[torque] ids_writer ids=, command=write_attribute

Then I found the issue!
The difference is between the initial state, in tests script it's null and in my app, it's an empty array.

So, I tried to change the migration in test script from:

  create_table "projects" do |t|
    t.bigint "employees_ids", array: true
    t.string "title"
    t.timestamps
  end

To have , default: [] but it fails on ActiveRecord error (I think it's related to torque columns quoting, I disabled it in my app, I'll open issue for that in the future).

So, to make it fail in the test script I did:

  create_table "projects" do |t|
    t.string "title"
    t.timestamps
  end
execute "alter table projects add column employees_ids integer[] default '{}'::integer[]"

See - reproduced.rb - https://gist.github.com/yosiat/37371424bb12236c0a4bb150bc8533ec#file-reproduced-rb

and the issue is reproduced :)

Found another issue - after I do project.save, I am adding project.update(title: "something")
After this commit - 43ab8f9 , the previous_changes and saved_changes in after_commit are empty (not containing that title is changed) - once I comment out the autosave association (which breaks other things) the changes are restored.

Adding reproduction tests -

# frozen_string_literal: true
require 'torque-postgresql'

require 'byebug'
require "active_record"
require "minitest/autorun"
require "logger"

ActiveRecord::Base.establish_connection(
  adapter:  "postgresql",
  database: "test",
  encoding: "unicode",
  host:     "localhost",
  port:     "5432",
  password: "12345",
  username: "test")

ActiveRecord::Schema.define do
  drop_table "employees", if_exists: true
  drop_table "projects", if_exists: true
  drop_table "departments", if_exists: true

  create_table "employees" do |t|
    t.string "name"
    t.timestamps
  end

  create_table "projects" do |t|
    t.string "title"
    t.timestamps
  end

  execute "alter table projects add column employees_ids integer[] default '{}'::integer[]"

  create_table "departments" do |t|
    t.bigint "employees_ids", array: true
    t.string "title"
    t.timestamps
  end
end


class Employee < ActiveRecord::Base
end

class Project < ActiveRecord::Base
  belongs_to_many :employees, foreign_key: "employees_ids"
end

class Department < ActiveRecord::Base
  belongs_to_many :employees, foreign_key: "employees_ids"
end


class BugTest < Minitest::Test
  def test_project_have_default_of_empty_array
    project = Project.create

    # Access employees here, cause the the after commit to be empty
    pp project.employees.count

    after_commit_previous_changes = nil
    after_commit_saved_changes = nil


    Project.after_commit do
      after_commit_previous_changes = previous_changes.dup
      after_commit_saved_changes = saved_changes.dup
    end

    project.update(title: "hello world")

    assert after_commit_previous_changes&.include?("title"), "after_commit previous_changes includes title"
    assert after_commit_saved_changes&.include?("title"), "after_commit saved_changes includes title"
  end

  def test_department_dont_have_default_of_empty_array
    department = Department.create

    # Access employees here, cause the the after commit to be empty
    pp department.employees.count

    after_commit_previous_changes = nil
    after_commit_saved_changes = nil


    Department.after_commit do
      after_commit_previous_changes = previous_changes.dup
      after_commit_saved_changes = saved_changes.dup
    end

    department.update(title: "hello world")

    assert after_commit_previous_changes&.include?("title"), "after_commit previous_changes includes title"
    assert after_commit_saved_changes&.include?("title"), "after_commit saved_changes includes title"
  end
end

It looks like autosave is running even if no employees is associated to the objects and writing empty array / nil via update_attribute and this causes the changes in after_commit to be empty

@crashtech I fixed the two issues above (with an additional issue, associating in before_save) in a fork of mine, and all my tests pass.

let me know what do you think about it:
master...yosiat:write-ids-only-if-they-changed

If you like, I'll do a pull request and we can go from there.

Hey @yosiat! Just pushed more changes d1858e9.

One thing that got me thinking. Should changing the target be automatically persisted when the record is persisted or not?
For has_many, that is mandatory, because the data is in a different table, so the change must happen to reflect reality.
For belongs_to, the change only happens when the owner record is actually saved.

I can see pros and cons on both sides, people might be more used to and expect it to behave more like has_many.
Currently, it does behaves more like the has_many. But I was thinking about adding a config like "associations.belongs_to_many_persisted_auto_save", so it could behave more like belongs_to. Any thoughts?

Hey @crashtech !

I checked this commit on my project and it fails multiple tests, looks like this is the cause:

# frozen_string_literal: true
require 'torque-postgresql'

require 'byebug'
require "active_record"
require "minitest/autorun"
require "logger"

ActiveRecord::Base.establish_connection(
  adapter:  "postgresql",
  database: "test",
  encoding: "unicode",
  host:     "localhost",
  port:     "5432",
  password: "12345",
  username: "test")

ActiveRecord::Schema.define do
  drop_table "employees", if_exists: true
  drop_table "projects", if_exists: true
  drop_table "departments", if_exists: true

  create_table "employees" do |t|
    t.string "name"
    t.timestamps
  end

  create_table "projects" do |t|
    t.string "title"
    t.timestamps
  end

  execute "alter table projects add column employees_ids integer[] default '{}'::integer[]"
end


class Employee < ActiveRecord::Base
end

class Project < ActiveRecord::Base
  belongs_to_many :employees, foreign_key: "employees_ids"
end

class BugTest < Minitest::Test
  def test_project_have_default_of_empty_array_after_commit_changes
    project = Project.create

    # Access employees here, cause the the after commit to be empty
    assert project.employees.count == 0

    after_commit_previous_changes = nil
    after_commit_saved_changes = nil


    Project.after_commit do
      after_commit_previous_changes = previous_changes.dup
      after_commit_saved_changes = saved_changes.dup
    end

    project.update(title: "hello world")

    assert after_commit_previous_changes&.include?("title"), "after_commit previous_changes includes title"
    assert after_commit_saved_changes&.include?("title"), "after_commit saved_changes includes title"
  end
end

Should changing the target be automatically persisted when the record is persisted or not? - what do you mean by that? can you give some example?

What do you think about the fork changes I wrote above ? it passes all my project tests & torque tests (and the new tests I added).

Btw, why you are using #presence when writing the ids? I saw, a case when I do project.employees = [] it writes employees_ids as null and not empty array (because of #presence method) - which later fails method calls on ids.

Should something like project.employees = FactoryBot.create_list(:employee, 5) when the project is already saved trigger an independent UPDATE, or it should wait till record.save to be persisted? That is the question. Right now, it triggers an independent UPDATE, which is the behavior of has_many, otherwise, it would behave more like belongs_to and wait for the save, before it is persisted.

Yes, I checked your fork, but not always the ids_writer should be called. Especially before concat_records, because there are Rails operations (like checking if the list has valid records) that must run before writing the ids. Also, build_changes is only called when changing the association, so there is no need to check the changes.

This test is pretty much the same as yours, so there might be some other thing in the way.

When there is no record associated, then the collum should be nil not []. It saves database space and it prevents further process to query using an empty array (present? is heavier than nil? and has a different purpose, so null is the right value for empty association).

About the first point, when I think about my app we are trying to do a transition from has_many to belongs_to_many so the natural fit will be autosaving, so the behaviour will be the same. When I think more generally, it makes sense than in other updates other developers will think about it as a column that you need to call .save on later. So a configuration parameter as you suggested + documentation on that matter is good enough IMO.

The difference between the two tests, is that I have a default of empty array I think and in the test, you have a default of null value - which relates to the #presence question. When you have a default of empty array and #presence call in my debugging sessions I did - I saw that build_changesis persisting the update because the value is changed (from nil to []).

I think that treating empty relations as nil have tradeoffs - one side is saving database space & processings but the other side is somewhat ergonomics - for example project.employees return an empty array, while project.employees_ids will return nil which is confusing, in addition on every check on project.employees_ids you need to make sure it's not nil & when you serialize it and the client expects to get an array (with values or empty) you need to do changes their (return empty array if it's nil).

I'll later add the configuration and documentation, I think it is the best approach.

It makes sense that the default value is getting in the way.

Now, regarding the nil. Remember that employees_ids is a column, and the belongs_to_many has little control over it, besides storing the value on it. However, the association does have power over employees and the methods created for it. Therefore, you can rely on employees.ids that will return an empty array (which would not query the DB due to scope.none! when it is nil).

This is the confusion around belongs_to_many. It has to behave as belongs_to (column must be filled or nil) but at the same time operate as has_many (have a collection and an array of values). In order to not create requirement or "forced default values to {}". That could even cause problems with default values like [1], because if it ensures to return [], the following operation would not correctly provide the default value employees_ids || [1].

I would have to change many other things to support []. One of them is this foreign_key_present?.

I understand your perspective and mental model, I'll need to see how I can address this - making this ids a default of null,
since it may cause multiple issues:

  • On serialization, we return these ids - our clients not sure know how to handle null value here
  • In application code, we use _ids method by rails which now may return nil, we can do employees.ids, but I believed that now I can spare some db queries if do .employees_idsinstead of employees.ids
  • If I do something like SomeModel.where(employee_id: project.employee_ids) - in a case where employees_ids is null, it will execute a query against db for finding SomeModel with null employee id, while with empty array it won't execute a query - another point I need to consider.
  • Doing migrate for changing default value to null can cause downtime to our system

About the two points, I can do override for employees_ids and do something like -

def employees_ids
   read_attribute("employee_ids") || []

which will make it easier to change on the system.

anyway, I tested my application with the latest fixes and changing default value to be nil and looks the issues are solved (expect places I didn't fixed yet with ids being nil)

@crashtech I decided eventually to go with the null approach, I still need to figure out how I can handle serialization to return an empty array.

While testing master branch I found some bugs, I fixed them all here - #57

and now all my tests pass, except the serialization tests which is ok.

We need to be very careful with changes like these:
https://github.com/crashtech/torque-postgresql/pull/57/files#diff-e96aa24fc03eee2df359cea33079fc130aae95ec874840dacd3134f1c5b72c37R14
https://github.com/crashtech/torque-postgresql/pull/57/files#diff-e96aa24fc03eee2df359cea33079fc130aae95ec874840dacd3134f1c5b72c37R161

Since this is a hook on an already existing concept, not always the solution is straightforward.
For instance, this problem was happening because the inversion between BelongsToMany <=> HasMany(Array) was not working properly.
And this other one because build_changes was not acting on delete_or_destroy.

I just pushed some fixes. If the specs pass, I will publish a new version.

@crashtech thanks for all the quick replies, fixed and feedback.

I have a PR in my project which passes all tests with the latest gem release, but since _ids have default of nil it has lots of implications to our app, and lots of defensive code was added, employee_ids || [] in all of our app code.

Do we have any option to allow a default of empty array with a configuration toggle? or is it something you are strongly against it?

Hi @yosiat. I spent some time thinking about this and I found a great solution. Give it a try with 4bca9b.

What it does now is relying on the column default value when it becomes empty.

I won't be releasing a new version just with this change, but you can definitely use it from this commit onweards.