QueueClassic/queue_classic

queue classic not thread safe

Closed this issue · 3 comments

We have been running into trouble with corrupted connections happening since we started using queue_classic, and I believe we've tracked down the issue. We are running multi-threaded rails with connection pooling, and it seems that QC stashes a single ActiveRecord connection in its default_conn_adapter class member, namely the connection at the time the conn adapter is created. This results in this single connection adapter being re-used across all threads for all subsequent enqueue calls.

This is obviously unsafe, for two reasons. First, there is a chance two threads doing enqueues simultaneously will reference the connection. Second, it is not the connection for the active thread and may actually be being used by another thread if that thread has checked the same one out of the pool.

The fix we've implemented for now is to monkey patch the rails queue classic adapter to create a new connection adapter each time pointing it to the thread's connection. This seems too hacky for a PR to rails, but might be the correct solution.

dup2 commented

Can you provide a sample for this? We are facing a similar issue and wanted to tackle this also with better connection handling when running our own daemon with threaded workers. Thanks.

kolen commented

This results in this single connection adapter being re-used across all threads for all subsequent enqueue calls.

Is this a problem? pg gem is quite thread-safe, as its underlying libpq, connections created in one thread can be reused in another. AFAIK, activerecord pool works this way: thread returns connection to pool, another thread can take it. Moreover, queue_classic has mutex for queries.

There's possibility of building libpq without thread safety, you can check PG::Connection.isthreadsafe, but AFAIK this is rare.

Second, it is not the connection for the active thread and may actually be being used by another thread if that thread has checked the same one out of the pool.

What's the problem with pool? Queue_classic takes one connection from the pool and never checks out it back. Connections can be safely reused across threads.

What error messages you get? I encountered protocol errors because of reused PG connections, but in forked web server, because there were multiple PG::Connection instances in different processes using single underlying socket created prior to to forking.

ukd1 commented

Closing as this is stale; @gfodor if you can share the code, I'd def check it out.