jmettraux/rufus-scheduler

Rufus and Database Connections

Closed this issue · 4 comments

Based on the rufus documentation, I have rufus initialized as a singleton in my rails app. As you note in this answer, the "scheduler lives as long as the Ruby on Rails process lives."

I have a scheduler that runs every 20 minutes and incorporates a db query. Recently I ran into this exception: "PG::UnableToSend: no connection to the server." Once this error began occurring, it occurred repeatedly every time the scheduler executed a job.

After finding more details here and here, I understand that db connections must be managed manually and that putting my code inside a with_connection block will ensure the release of a db connection after code execution completes.

From my understanding, the singleton spawns a new thread for every job. Those threads can share a database connection. So the singleton checks out a database connection on the first job that is run, maintains that same connection once the first thread completes, and then uses that connection when the next job runs.

If my understanding is correct, wouldn't another solution to the problem I described above be to not use a singleton? If a new rufus instance is initialized every time a job is run, won't that instance have to create a new database connection because there is no connection maintained across jobs? If not, then how would rufus maintain a database connection once a thread completes without a long-running singleton?

If my alternative solution would also work, then what is the benefit of using a singleton in the first place (at least for rails apps where db operations are very common)? Why not just initialize a new rufus instance every time? It seems to me that ensuring a fresh db connection is made this way outweighs the tradeoff in performance/memory of spinning up a new instance every time a job is run. And in the alternative solution, a developer does not have to wrap every usage of rufus in a with_connection block.

Please let me know if you think either alternative would work (with_connection block or not using a singleton) to handle db connection exceptions. If so, which approach do you recommend, and why?

Based on the rufus documentation, I have rufus initialized as a singleton in
my rails app. As you note in this answer, the "scheduler lives as long as the
Ruby on Rails process lives."

I have a scheduler that runs every 20 minutes and incorporates a db query.
Recently I ran into this exception: "PG::UnableToSend: no connection to the
server." Once this error began occurring, it occurred repeatedly every time
the scheduler executed a job.

After finding more details here and here, I understand that db connections
must be managed manually and that putting my code inside a with_connection
block will ensure the release of a db connection after code execution
completes.

From my understanding, the singleton spawns a new thread for every job.

No, a rufus-scheduler instance has a pool of work threads. Each time a job triggers, a free work thread is used for it. If necessary (all work threads already assigned) a new work thread will be created and assigned.
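As a rough illustration of that "reuse a free work thread, else create one" behaviour, here is a toy model in plain Ruby. It is not rufus-scheduler's actual implementation: `ToyWorkPool`, `trigger` and `idle` are names made up for this sketch, and error handling is left out.

```ruby
# Toy model of "use a free work thread, else create one".
# NOT rufus-scheduler's code; every name here is hypothetical.
class ToyWorkPool
  attr_reader :threads, :idle

  def initialize
    @queue = Queue.new   # jobs waiting to be picked up by a work thread
    @threads = []        # every work thread created so far
    @idle = 0            # work threads currently free
    @mutex = Mutex.new
  end

  # Called each time a job triggers.
  def trigger(&job)
    @mutex.synchronize do
      if @idle > 0
        @idle -= 1                            # reserve a free work thread
      else
        @threads << Thread.new { work_loop }  # all busy: create a new one
      end
    end
    @queue << job
  end

  private

  def work_loop
    loop do
      job = @queue.pop                   # blocks until a job is handed over
      job.call
      @mutex.synchronize { @idle += 1 }  # this work thread is free again
    end
  end
end
```

Triggering two jobs one after the other reuses the same thread, while triggering a second job while the first is still running grows the pool to two threads.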

Those threads can share a database connection. So the singleton checks out a
database connection on the first job that is run, maintains that same
connection once the first thread completes, and then uses that connection
when the next job runs.

If my understanding is correct, wouldn't another solution to the problem I
described above be to not use a singleton? If a new rufus instance is
initialized every time a job is run, won't that instance have to create a new
database connection because there is no connection maintained across jobs? If
not, then how would rufus maintain a database connection once a thread
completes without a long-running singleton?

A rufus-scheduler instance doesn't check out any connection. There is nothing in rufus-scheduler concerned with ActiveRecord, Rails, Sequel, whatever.

Having a with_connection block in your job is elegant and sufficient.

If my alternative solution would also work, then what is the benefit of using
a singleton in the first place (at least for rails apps where db operations
are very common)? Why not just initialize a new rufus instance every time? It
seems to me that ensuring a fresh db connection is made this way outweighs
the tradeoff in performance/memory of spinning up a new instance every time a
job is run. And in the alternative solution, a developer does not have to
wrap every usage of rufus in a with_connection block.

Please let me know if you think either alternative would work
(with_connection block or not using a singleton) to handle db connection
exceptions. If so, which approach do you recommend, and why?

Having a singleton, in other words a handy reference to a single rufus-scheduler instance, is very convenient. What would you need more schedulers for?

I repeat: rufus-scheduler knows nothing about databases or database connections. It deals in scheduling, and only in scheduling.

The best solution is to use with_connection and stay with a single rufus-scheduler instance (the singleton).
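A minimal sketch of that recommendation, assuming a Rails app with ActiveRecord. The helper name `schedule_db_job` and its parameters are hypothetical; it only relies on the scheduler responding to `every` and the pool responding to `with_connection`, which is how `Rufus::Scheduler` and ActiveRecord's connection pool are commonly used.

```ruby
# Hypothetical helper. `scheduler` is expected to respond to #every like a
# Rufus::Scheduler instance, `pool` to #with_connection like an
# ActiveRecord connection pool.
def schedule_db_job(scheduler, pool, interval, &work)
  scheduler.every(interval) do
    # Check a connection out for this run only. with_connection hands it
    # back to the pool when the block exits, even if the block raises.
    pool.with_connection { |conn| work.call(conn) }
  end
end

# In a Rails app this might be called as follows (assumed setup, not run here):
#
#   schedule_db_job(Rufus::Scheduler.singleton,
#                   ActiveRecord::Base.connection_pool,
#                   '20m') { |conn| conn.execute('SELECT 1') }
```

The scheduler stays a single long-lived instance; only the connection is scoped to each run of the job.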

Closing the issue, but feel free to ask for clarifications here.

Best regards and a happy new year!

@jmettraux Hi John, thanks for the additional details! Understood that rufus is not concerned with ActiveRecord or Rails. But after ActiveRecord checks out the connection, Rails assigns that connection to the rufus singleton, correct? And that association is long-lived because the singleton is meant to stay in memory as long as the app is running, correct?

If that's the case, the fundamental question is still relevant. Wouldn't another solution to this problem be not using a singleton, and instead instantiating a new rufus instance every time a job is run?

Will incorporate the with_connection block in my code, but at this point I'm trying to get a deeper understanding of the issue. Please let me know your thoughts. Thanks again and happy new year!

Hello Manish,

Understood that rufus is not concerned with ActiveRecord or Rails. But after ActiveRecord checks out the connection, Rails assigns that connection to the rufus singleton, correct?

Sorry, I should also have stated that there is nothing in Rails or ActiveRecord concerned with rufus-scheduler. So it's incorrect.

And that association is long-lived because the singleton is meant to stay in memory as long as the app is running, correct?

It's incorrect because there is no association.

If that's the case, the fundamental question is still relevant. Wouldn't another solution to this problem be not using a singleton, and instead instantiating a new rufus instance every time a job is run?

That's not the case, so the question is not relevant.

Also, it reads like you'd want a rufus-scheduler instance scheduling a job and then creating a new rufus-scheduler instance as the job is running... That sounds like a waste of resources. Why should a scheduler schedule another scheduler to schedule a job when it can schedule the job directly itself?

Please look at ActiveRecord's code, especially the implementation of with_connection; it might answer your question. In my vague memories of looking into that (10+ years ago), ActiveRecord was placing its database connections in thread-local variables. Now remember that rufus-scheduler uses work threads. The thread underlying your particular run of a rufus-scheduler job is not the thread that answers your HTTP request. with_connection is your friend, IIRC.
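To make the thread-local idea concrete, here is a toy illustration in plain Ruby. It is not ActiveRecord's code; `toy_connection` is a made-up name, and a bare `Object` stands in for a database connection.

```ruby
# Toy stand-in for a per-thread connection slot; NOT ActiveRecord's code.
def toy_connection
  thread = Thread.current
  unless thread.thread_variable?(:toy_connection)
    # First use on this thread: create and remember a "connection".
    thread.thread_variable_set(:toy_connection, Object.new)
  end
  thread.thread_variable_get(:toy_connection)
end

request_conn = toy_connection                       # e.g. the thread serving a request
job_conn     = Thread.new { toy_connection }.value  # e.g. a scheduler work thread

puts request_conn.equal?(toy_connection)  # same thread: is it the same object?
puts request_conn.equal?(job_conn)        # other thread: is it the same object?
```

Each thread ends up with its own object, which is why a job running on a work thread has to obtain (and release) its own connection.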

Best regards!

Ok, good to know, thank you again for the context!