Performance of Postgres notifier - pg_notify is O(N^2)
Problem
It looks like the oban_notify trigger can become a serious bottleneck when scaling from hundreds to thousands of jobs per second, because pg_notify has a de-duplication mechanism with O(N^2) complexity.
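To illustrate where a quadratic cost of this kind comes from, here is a minimal Elixir model of a linear-scan duplicate check. It is only a sketch, not PostgreSQL's actual code: the module name `NotifyDedupSketch` is hypothetical, and it assumes each new notification is compared against every notification already pending in the same transaction.

```elixir
defmodule NotifyDedupSketch do
  @moduledoc false

  # Hypothetical model, not PostgreSQL's implementation.
  # Queues each payload unless a linear scan of the pending list finds it.
  # Returns {queued_count, total_comparisons}.
  def enqueue_all(payloads) do
    {pending, comparisons} =
      Enum.reduce(payloads, {[], 0}, fn payload, {pending, comparisons} ->
        {found?, scanned} = scan(pending, payload, 0)
        pending = if found?, do: pending, else: [payload | pending]
        {pending, comparisons + scanned}
      end)

    {length(pending), comparisons}
  end

  # Compares `payload` against each pending entry, stopping at the first match.
  defp scan([], _payload, scanned), do: {false, scanned}
  defp scan([payload | _rest], payload, scanned), do: {true, scanned + 1}
  defp scan([_other | rest], payload, scanned), do: scan(rest, payload, scanned + 1)
end

# 1_000 distinct payloads: every scan walks the whole pending list,
# so the total is 0 + 1 + ... + 999 comparisons.
payloads = for i <- 1..1_000, do: "job-#{i}"
IO.inspect(NotifyDedupSketch.enqueue_all(payloads))
# prints {1000, 499500}
```

With N distinct payloads the scan count is N(N-1)/2, so doubling N roughly quadruples the work. Whether this is the dominant cost in the benchmark below is not something the sketch can show.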
I tried the following benchmark (Elixir 1.13.4, 4 vCPU, 120 connections in an Ecto pool, Oban 2.13.4, Postgres 11.19 on AWS RDS t2.medium):
Benchee.run(%{
  schedule: fn ->
    Repo.transaction(fn ->
      job = ExampleWorker.new()
      Oban.insert(job)
    end)
  end,
}, parallel: 100)
I launched it twice: before and after disabling the trigger:
ALTER TABLE oban_jobs disable trigger all;
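Note that `DISABLE TRIGGER ALL` turns off every trigger on the table. A narrower variant, sketched here as an Ecto migration, would target only the notification trigger. It assumes the trigger is named oban_notify (as referenced above) and that oban_jobs lives in the default schema; the module name is a placeholder.

```elixir
defmodule MyApp.Repo.Migrations.DisableObanNotifyTrigger do
  # Placeholder module name; adjust to your application.
  use Ecto.Migration

  # Assumes the trigger is called `oban_notify` and the table is not prefixed.
  def up do
    execute("ALTER TABLE oban_jobs DISABLE TRIGGER oban_notify")
  end

  # Restores the trigger so the change can be rolled back.
  def down do
    execute("ALTER TABLE oban_jobs ENABLE TRIGGER oban_notify")
  end
end
```

Keeping a `down/0` makes it easy to re-enable the trigger and repeat the before/after comparison.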
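Observation: disabling the trigger increased the throughput of the Oban.insert() operation from 698 ops/sec to 1967 ops/sec (the ips column below is per process, so with parallel: 100 the totals are 6.98 × 100 and 19.67 × 100)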
...
Name                     ips      average    deviation       median       99th %
with_pg_notify          6.98    143.32 ms      ±16.00%    136.13 ms    236.24 ms
without_pg_notify      19.67     50.84 ms      ±84.70%     40.84 ms    284.95 ms
... while decreasing the DB load by 2x (if measured in Average Active Sessions).
On the screenshot below, the first spike is with pg_notify, the second spike is without. The notification trigger contributes to waits of the object lock type and to higher CPU usage.
Expected solutions
A. Just set expectations in the docs, e.g. "the default Postgres notifications work fine for hundreds of RPS... Consider the PG notifier to handle thousands of RPS".
B. Investigate throttling with pg_notify. Maybe add a configuration setting to balance between "reactiveness" and throughput.
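For reference, switching to the PG notifier mentioned in option A is a configuration change. The fragment below is a minimal sketch: `:my_app`, `MyApp.Repo`, and the queue list are placeholder values, and it assumes the nodes are already connected in an Erlang cluster, which the PG (process groups) notifier relies on.

```elixir
# config/config.exs
import Config

# `:my_app`, `MyApp.Repo`, and the queue list are placeholders.
config :my_app, Oban,
  repo: MyApp.Repo,
  # Use process groups instead of Postgres LISTEN/NOTIFY for notifications.
  notifier: Oban.Notifiers.PG,
  queues: [default: 10]
```

This changes only how Oban nodes exchange notifications. Whether the database trigger should also be disabled is a separate decision.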
Alternatives Considered
A Redis Pub/Sub notifier for high-load Elixir systems that lack a proper cluster setup.
Additional Context
- https://medium.com/@ericscouten/elixir-ecto-postgres-a-saga-about-database-performance-488ba59128e
- https://elixirforum.com/t/looking-for-help-with-poor-ecto-query-performance/25476/72
- https://www.postgresql.org/message-id/054001d45a74$23960f10$6ac22d30$@jdemoor.com
Thanks to @marty-stranger for the finding.
Thanks for reporting on your findings. The issue is with the trigger and the notifier, not the notifier by itself. It's perfectly valid to use Postgres notifications without the triggers (necessary for some functionality, even).
Adding some documentation to set expectations is a great idea. Where would you expect to see such a comment? BTW, there's already a note in the PG docs about disabling triggers when migrating.
I'd expect this performance-related notice to be placed in the "Caveats" section of the Oban.Notifiers.Postgres moduledoc.
@vovayartsev Warning documentation updated. Side note, you mentioned testing on PG 11.19, but only PG 12+ is officially supported (and 14+ is highly recommended).