godaddy/activerecord-delay_touching

Would it be possible to skip touching altogether?

Closed this issue · 4 comments

Thanks for this awesome gem! I'm using it to speed up a CSV import script in a Rails app that does a lot of touching. Without this gem, the script takes about 31 minutes with a particular data set. With the gem, it takes a little less than 18 minutes.

Ideally, touching would be bypassed completely during the import because the touching is only needed to populate a single Postgres tsvector column used for full-text search. If I remove all touching code from my app, the script takes about 8 minutes. Then, populating the search index takes an additional 3 minutes via Location.find_each(&:touch). That's a total of 11 minutes, which is significantly faster.

Since touching is not necessary during the initial data import, I was wondering if there is a way to bypass it entirely. Something like ActiveRecord::Base.skip_touching do.

Thoughts?

Hi @monfresh. What version of ActiveRecord are you using? The ability to skip touch altogether is available in core AR as of Rails 4.1, I believe.
http://api.rubyonrails.org/classes/ActiveRecord/NoTouching/ClassMethods.html
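For illustration, here is a minimal sketch of wrapping an import in no_touching. The helper name import_without_touching and the rows argument are hypothetical and not part of the gem or of Rails. The sketch assumes model is an ActiveRecord class (for example Location) on Rails 4.1 or later, where no_touching is available.

```ruby
# Hypothetical helper: create records while touch calls are disabled.
# `model` is assumed to be an ActiveRecord class such as Location, and
# `rows` an array of attribute hashes parsed from the CSV file.
def import_without_touching(model, rows)
  # no_touching disables touch for this class for the duration of the block.
  model.no_touching do
    rows.each { |attrs| model.create!(attrs) }
  end
end
```

Calling ActiveRecord::Base.no_touching instead of Location.no_touching should disable touching for every model inside the block, which may be what an import script wants.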

@monfresh I have one other thought, for what it's worth. Based on your description above, this might make your batch run a bit faster.

Rather than run Location.find_each(&:touch), which fires off a SQL query for every single record, you could use some slight hackery to perform the actual update in a single SQL call, as follows. Do note that I call it a hack because you're tapping into a private AR method. However, as long as you have reasonable insight into your own code here, the performance gains might be significant enough to make it worthwhile.

# Private AR API: returns the timestamp columns that a touch would update (e.g. updated_at)
touch_attributes = Location.first.send(:timestamp_attributes_for_update_in_model)
touch_timestamp = Time.now
update_attributes = Hash[touch_attributes.map { |attr| [attr, touch_timestamp] }]

Location.update_all(update_attributes) # fires a single SQL UPDATE for all records

Of course, you may want to limit this to only the records that you imported (in your example you're touching all Location records), in which case you could pluck the IDs of the location records and add a where clause to the above update, or some other appropriate filter.
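A rough sketch of that scoped variant follows. The helper name touch_imported and the batch_size default are assumptions for illustration only. The sketch also assumes updated_at is the only timestamp column that needs to change, and that ids holds the IDs collected during the import.

```ruby
# Hypothetical helper: bump updated_at for the imported records only.
# `model` is assumed to be an ActiveRecord class such as Location.
# update_all issues a plain SQL UPDATE, so validations and callbacks are skipped.
def touch_imported(model, ids, now: Time.now, batch_size: 1_000)
  # Slice the IDs so that no single UPDATE carries an enormous IN (...) list.
  ids.each_slice(batch_size).sum do |slice|
    # update_all returns the number of affected rows; the sum is the total.
    model.where(id: slice).update_all(updated_at: now)
  end
end
```

This trades the single statement for one UPDATE per batch, which is still far fewer queries than one per record.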

If you want "on touch" callbacks to fire because dependent AR callbacks are watching for Location touch events (i.e. cascading events, which you should avoid if you care at all about performance), you can iterate through the records and trigger those callbacks directly.

Location.find_each { |loc| loc.run_callbacks(:touch) }

Just a thought. YMMV 🍻

Thanks a lot, Michael! My search-fu was lamentable last night. I should have used Dash to search for "touch", and I would have found the no_touching method. It works beautifully.

As for using update_all, it makes perfect sense since I'm only updating the updated_at field and I don't need validations or callbacks. I don't even need to do touch_attributes = Location.first.send(:timestamp_attributes_for_update_in_model) because updated_at is the only attribute I need to update.

With update_all, it takes 2 minutes and 42 seconds versus 3 minutes with find_each(&:touch).

@monfresh, indeed. ActiveRecord lets you specify the timestamp columns but, since this is your own model, you can simply specify the updated_at column in your update. It's less future-proof, but much simpler if you're comfortable that such things will not change.
Good luck!