charkost/prosopite

N+1 error for `upsert_all` with `unique_by` in Rails with postgres

Closed this issue · 3 comments

Thanks for your great gem! I have just discovered it and so far it looks like a great alternative to bullet.
This one behaviour is slightly annoying though because my main motivation for using one of rails' upsert_alls is to reduce the number of queries. I think the main problematic part is the unique_by keyword, which triggers some internal lookups which are beyond your control:

       -- N+1 queries detected:
         SELECT a.attnum, a.attname
       FROM pg_attribute a
       WHERE a.attrelid = 119170
       AND a.attnum IN (3)
     
         SELECT a.attnum, a.attname
       FROM pg_attribute a
       WHERE a.attrelid = 119170
       AND a.attnum IN (3,2)
     
         SELECT a.attnum, a.attname
       FROM pg_attribute a
       WHERE a.attrelid = 119170
       AND a.attnum IN (2)

And if I run these queries by themselves I get back the attributes which I passed into the unique_by.

Example (haven't tested this precise code below, but it's more or less a simplified version of what I have):

# table
create_table :posts do |t|
  t.string :author_name
  t.string :title
  t.text :description
  t.index %i[author_name title], unique: true
end

# model
class Post < ActiveRecord::Base
  validates_uniqueness_of :title, scope: :author_name
end

# action
Post.upsert_all(
  [
    { author_name: "Geoff", title: "Foo", description: "bar" },
    { author_name: "Frank", title: "Baz", description: "hello world" },
    # etc...
  ],
  unique_by: %i[author_name title]
)

Would it be possible to exclude any queries to the pg_* tables for postgres users? Or maybe allow users to specify a whitelist for specific tables?

My current workaround is to find the specific line within active_record and add it to the allow_list - but this feels sub-optimal, as obviously this could very easily change:

Prosopite.allow_list = ["active_record/persistence.rb:243"]

Upon a bit more investigation, it appears that the queries which triggered it have the name SCHEMA - so could be a decent way of excluding these?

 # lib/prosopite.rb:176
      ActiveSupport::Notifications.subscribe 'sql.active_record' do |_, _, _, _, data|
        sql = data[:sql]
        p data.slice(:name, :sql)
        # outputs {:name=>"SCHEMA", :sql=>"SELECT a.attnum, a.attname\nFROM pg_attribute a\nWHERE a.attrelid = 119139\nAND a.attnum IN (2)\n"}
        # ...
      end

so maybe you could have

      ActiveSupport::Notifications.subscribe 'sql.active_record' do |_, _, _, _, data|
        sql, name = data[:sql], data[:name]

        if scan? && name != "SCHEMA" && sql.include?('SELECT') && data[:cached].nil?
          # ...
      end

or something to the same effect. Maybe I could open a PR for this?

Hello, thanks for reporting this! I am a bit busy these days but i will check it asap!

Thanks! No worries, I appreciate that. I did open a PR for this in the end - #27 - I tried to keep it pretty light but let me know if there is a different approach for this that you would prefer!