mhfs/devise-async

Devise notification queueing not working as expected (sometimes?)

Opened this issue · 13 comments

jjb commented

Here's a situation I am facing. It's very specific to my app, but I'm pretty sure has a root cause in devise-async.

I sometimes remind users who haven't confirmed their email addresses to do so. It goes something like this (some things renamed/simplified):

class User
  def remind
    self.email_confirmation_reminder_count += 1
    send_confirmation_instructions
    save!
  end
end

then in the email template...

- if @user.email_confirmation_reminder_count > 0
  %h6 Complete your account registration.
- else
  %h6 Welcome!

Problem: Sometimes when sending the confirmation instructions in this way, @user.email_confirmation_reminder_count is 0, even though it is incremented in User#remind. Inspecting the object at the console after the email has been sent, the incrementing was indeed successful as expected. So, for some reason the User object that was accessed within the template has the value set to 0, but persisting the incremented value within User#remind was successful.

And as a reminder, this problem only happens sometimes.

A Pretty Good Theory

I've actually seen a similar phenomenon before, and I thought it might be the case here too. If it is the case here, then this is what happens:

  1. User.email_confirmation_reminder_count is incremented on the in-memory object
  2. send_confirmation_instructions creates the background job. The background job is started and finishes before step 3...
  3. The User object, with email_confirmation_reminder_count incremented, is persisted to the DB.

But That Theory Doesn't Hold With devise-async

As you may know, devise-async specifically goes to great lengths to avoid such a situation: https://github.com/mhfs/devise-async/blob/master/lib/devise/async/model.rb

Beginnings of Other Theories

Help Me

If anyone has any ideas or suggestions for what to investigate, it would be much appreciated!

Hopefully when I solve this I can contribute some documentation that will help others!

@jjb When using delayed job it is possible to further delay the running of a job by using something like run_at: -> { 5.minutes.from_now }. Perhaps you can use this to rule out that the job is processed before save! is called on User? But as you already mentioned this should not be the case...

jjb commented

@baschtl Good idea. I haven't gone that route of experimentation yet because I can't reliably reproduce the problem. And also I want to know the cause of the problem before doing a workaround, just in case it's something more serious and/or that affects other things.

jjb commented

I figured it out!

My code uses devise's send_confirmation_token

in send_confirmation_token, there is this code: https://github.com/plataformatec/devise/blob/master/lib/devise/models/confirmable.rb#L97

@raw_confirmation_token is always nil, so a new one is generated, here: https://github.com/plataformatec/devise/blob/master/lib/devise/models/confirmable.rb#L227

Where, as you can see, the user object is persisted (save(validate: false)).

So now, changed? will be false. Meanwhile, back in send_confirmation_instructions: https://github.com/plataformatec/devise/blob/master/lib/devise/models/confirmable.rb#L101, send_devise_notification is invoked. And as we know, it's devise-async's monkeypatched send_devise_notification, which depends on checking changed? to decide if it should enqueue or send right away: https://github.com/mhfs/devise-async/blob/master/lib/devise/async/model.rb#L38

So, that's the problem.

My solution was to add generate_confirmation_token! to my code before incrementing my counter, like so:

class User
  def remind
    generate_confirmation_token! # changes model, then saves, so we're back to changed? == false
    self.email_confirmation_reminder_count += 1 # changes model, leaving changed? == true
    send_confirmation_instructions
    save!
  end
end

I haven't thought about what a more generalized solution for devise-async would be, but it seems like there could be one.

Thoughts?

Did I get this right:

  1. You increment the reminder count (changed? == true).
  2. Then, you call send_confirmation_instructions which generates a new confirmation_token and saves the record (changed? == false).
  3. Then, the actual email sending takes place and devise-async checks for changed? which is in this case always false. Hence, this job is directly forwarded to the corresponding job backend.

What I don't get is:

When in step (3) the record is not changed? anymore your incrementation of the reminder count should already have been persisted, shouldn't it? So, why is the reminder count not as expected in the mailer template?

Another thing: the job backend only gets the id of the record. Hence, it fetches the record again when the job is being processed. Hence, the incrementation of the reminder count should be present in the newly fetched record.

Is this then a concurrency issue (i.e., the database transaction saving the record has not been finished when the mail is sent)? Is this the root cause?

jjb commented

Did I get this right:
...

Yes, but note that your step (3) is actually within your step (2) -- send_confirmation_instructions does the changing, saving, and sending.

When in step (3) the record is not changed? anymore your incrementation of the reminder count should already have been persisted, shouldn't it?

No, because I haven't saved it yet. In my method, the job is queued up in send_confirmation_instructions, and then the record is persisted in the next line, save!. Sometimes, the job runs before save! happens. And that's when the problem happens.

Another thing: the job backend only gets the id of the record. Hence, it fetches the record again when the job is being processed. Hence, the incrementation of the reminder count should be present in the newly fetched record.

I think this is the same as your previous point? If not let me know :)

@jjb Ah, I see my mistake, now. :-)

Hmm, at least you could do

class User
  def remind
    increment!(:email_confirmation_reminder_count)
    send_confirmation_instructions
  end
end

Or is there a reason for first generating the confirmation token and then incrementing the count on the instance in your solution proposal?

jjb commented

I'll put my annotated solution here again, and then also the buggy version:

This works!

class User
  def remind
    # we begin with an unchanged model (changed? == false)

    # changes the model (changed? == true)
    # saves (changed? = false)
    generate_confirmation_token!

    # changes model (changed? == true)
    self.email_confirmation_reminder_count += 1

    # does NOT invoke generate_confirmation_token!
    # sends the email, which devise async queues
    send_confirmation_instructions
    save!
  end
end

This is buggy :'-(

class User
  def remind
    # we begin with an unchanged model (changed? == false)

    # changes model (changed? == true)
    self.email_confirmation_reminder_count += 1


    # invokes generate_confirmation_token!
    #   changes the model (changed? == true)
    #   saves (changed? = false)
    # sends the email, which devise async does NOT queue
    send_confirmation_instructions
    save!
  end
end
jjb commented

@baschtl Okay, after I wrote all that, I realized I didn't understand your suggestion :-D

You are saying why not just make sure the incrementation is persisted to the database before queueing up the email? Yes you are right that is another option. I don't like it for a few reasons.

First of all I generally try to avoid increment!, update_column, etc., because they don't go through the regular ActiveRecord chain of things (validators, filters). So, I have to switch my mental model and assumptions.

Second, with how I do my code now I don't have to think about behavior if any part of the code fails and a database transaction has to be rolled back. Now, I think with your solution, it will be okay, but... I didn't want to have to think about that and figure it out :)

For both reasons above, I think your solution is "more brittle" if/when other code is introduced down the line.

Another reason: I will feel like I'm "fighting" devise and devise_async by trying to hack around their problems and assumptions instead of creating the environment they want.

Now, one might say that my solution is the more "fighting" version because generate_confirmation_token! usually happens within devise and here I am unnaturally invoking it outside of a devise method. This is perhaps a good argument :) but for some reason mine feels more natural.

Also, perhaps the more important overall point: devise_async needs to be fixed so that its queueing system doesn't break when a devise method persists an object. (right?)

@jjb Well, no need for excuses. It's your decision. ;-)

In the end, I don't know what exactly happens with this gem. As mentioned in #64 there might be the possibility to use Active Job. This would make devise-async superfluous. You might want to consider this as another option.

jjb commented

Gotcha -- yes I saw that rails 4.2 seems to have devise-async-like features.

@baschtl just curious about choosing between devise-async and Active Job, and couldn't get enough clarity from the various comments about it.

Please correct me if I'm wrong, but wouldn't active job simply queue the emails to be background processed, but won't actually solve those race conditions between the job being queued and the model not yet being committed to the database?

@gingerlime Hey, I do not have enough experience with Active Job and this particular case to tell the truth. But, yes, this issue could also occur when using Active Job.

However, you should consider using Active Job if it fits your use case as it might be more future proof.

I might have the time over the next days to take a look at this with a minimal implementation.

Thanks @baschtl, no experience with it either, although it kinda makes sense to use it when we upgrade to rails 4.2 ...

Was just wondering about the actual promise of Active Job in this particular case. I can't imagine it would, by itself, resolve this potential race condition between the job being executed and the user creation transaction committed though (devise hooks the sending after save, rathan than after commit). That's why I thought I should ask, in case someone does have any experience with it.