magnusvk/counter_culture

Incorrect Count After Asynchronous Concurrent Updates When Using execute_after_commit

Closed this issue · 7 comments

When enqueueing multiple background jobs (e.g. Sidekiq) which utilize counter_culture's execute_after_commit, the final count will be incorrect (lower than expected).

Here is an over simplification of the setup:

class Campaign < ApplicationRecord
  has_many :participants, dependent: :destroy
end

class Participant < ApplicationRecord
  belongs_to :campaign
  counter_culture :campaign, execute_after_commit: true
end

class DeleteParticipantsJob
  include Sidekiq::Worker
  
  def perform(campaign_id, participant_ids)
    campaign = Campaign.find(campaign_id)
    participant_ids.each do |participant_id|
      campaign.participants.find(participant_id)&.destroy
    end
  end
end

I'm running 2 Sidekiq workers, each with 10 threads. When enqueuing multiple DeleteParticipantsJobs, each with the same or overlapping set of participant_ids, the final count on the Campaign will often be lower than expected (or negative in the case where all participants are deleted).

This issue is not present when not using execute_after_commit.

I suspect that if multiple threads attempt to find and destroy the same participant that counter_culture is executing the after commit count update even if the participant was not actually destroyed by the thread in question (e.g. the record was already destroyed, no SQL DELETE was executed, yet counter_culture still executed the after_commit callback).

I believe that I have isolated this issue counter_culture's the use of after_create, after_update and before_destroy callback methods regardless of whether the execute_after_commit option is present. I believe that when the execute_after_commit option is present, counter_culture should use after_commit callback methods to ensure that the callback is not executed multiple times.

@magnusvk I have a draft fix in a fork. It's not yet ready for me to submit as a pull request, but I would love if you could check it out and provide your initial feedback (am I on to something here, or way off in the weeds?). Thanks!

Interesting! Honestly I have no idea if your fix in the fork works—but if it does, great! One could argue that’s a better way to do this anyhow, so if you can show that works and it keeps all the other tests green, then 👌🏻

My other reaction here is just—if you do things outside of the transaction it’s not going to be 100% safe. But that’s not to say we shouldn’t fix it if we can! Let me know what you find.

@magnusvk Thanks for your feedback.

The feature branch in the fork linked above does solve the issue of incorrect counts; however, when using counter_culture's support for Paranoia and Discard, the counts recorded by soft deleting & restoring can still be incorrect. This is because those respective gems do not utilize after_commit callbacks.

Given I'm also using Discard in a few places, it looks like I'll have to solve this myself with some pessimistic locking in my Sidekiq job.

While I still think that solving this in counter_culture would be the best approach, I'm not sure if a half-baked solution is better than no solution. Your thoughts on whether you want a pull request for the above?

I think it's still an improvement to use the after_commit callbacks, right? If it solves the issue unless you're using Paranoia or Discard, that'll still apply to a large chunk of users. So I'd welcome a PR for that change. I don't think we can add a test that verifies this fixes the concurrency issue, but even without that I'd be happy to merge as long as all the other tests are green.

@jbritten any follow-up here, did you want to open a PR with those changes?

@magnusvk sorry for letting this fall through the cracks. I've just open PR #325.

Closing for inactivity.