samsondav/rihanna

Postgres error: "you don't own a lock of type ExclusiveLock"

tombh opened this issue · 5 comments

tombh commented

I stepped through the Rhianna versions and found that this error is produced in v1.3.3 onwards.

[error] GenServer Rihanna.JobDispatcher terminating
** (MatchError) no match of right hand side value: %Postgrex.Result{columns: ["pg_advisory_unlock"], command: :select, connection_id: 21507, messages: [%{code: "01000", file: "lock.c", line: "1901", message: "you don't own a lock of type ExclusiveLock", routine: "LockRelease", severity: "WARNING", unknown: "WARNING"}], num_rows: 1, rows: [[false]]}
    (rihanna) lib/rihanna/job.ex:342: Rihanna.Job.release_lock/2
    (rihanna) lib/rihanna/job.ex:302: Rihanna.Job.mark_successful/2
    (rihanna) lib/rihanna/job_dispatcher.ex:67: Rihanna.JobDispatcher.handle_info/2
    (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

Please ask if you need more details, but this is all on a very standard setup; Postgres v11.3 and Ecto, etc all installed at their latest versions from yesterday.

Should I even be installing Rhianna above v1.3.0? I just saw the advice to use 0.0.0 in the README and assumed it was an out of date reference, so I looked through the commits to find the latest version.

lpil commented

I'm getting this too after updating to the latest version.

v1.3.1...v1.3.4

Unsure what the problem is here, but there's very little code to explore!

lpil commented

Thanks for the report @tombh

@lpil I'm not certain this issue is resolved. I'm still getting a boatload of these errors in production with version 1.3.5. All coming with a similar stacktrace to what @tombh mentioned, following a mark_successful.

Actually... chalk this one up to novice DBA-ing right here. I'm gonna leave my notes in case anyone else runs into this.

Short answer: Advisory locks were sometimes not being held onto because of my pg_bouncer connection pool settings, leading to the following symptoms:

  • An inordinate number of multiply-executed jobs
  • Jobs being "orphaned" in the database with no failed_at but seemingly being "skipped over" by the workers (...because the DB was holding a lock for a process that didn't close it)
  • The error @tombh mentioned above showing up in logs from Rihanna.Job.release_lock/2

My solution was specific to my DBaaS provider (Digital Ocean), and may be translate-able to other scenarios:

  • The default Pooling Mode for pg_bouncer was set to Transaction Mode, which I believe is what was causing the orphaning above
  • In the text for my seemingly-successful switch, Session Mode, their documentation states:

Session mode is useful when your application uses prepared statements, advisory locks, listen/notify, or other features which operate on a session level rather than a transaction level.

...which has seemed to me to be the key here.

I'm not certain this is the fix. But it seems to be working at the moment. If this is an uncommon occurrence, then my note here may help folks out. And if it's something we suspect folks might run into with other DB providers, maybe there's room to flag it as a gotcha for folks in documentation?

@lperiodbose Yup when I saw this my first thought was pgbouncer, which would break any application that relies on advisory locks.

Would love to see a PR adding this to the README or similar!