oban-bg/oban

Missing unique constraint on `oban_peers` table

Closed this issue · 6 comments

Environment

  • Oban 2.15.2, Oban Pro 1.0.0
  • PostgreSQL 14.6
  • Elixir 1.15.2, Erlang/OTP 26.0.2
  • Clustered deployment with two instances, using Oban.Peers.Postgres

Current Behavior

With the above configuration (specifically, a clustered deployment with two or more instances and Oban.Peers.Postgres configured), we see the following repeatedly in our Postgres logs:

2023-07-31 23:02:42.157 UTC [20516] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2023-07-31 23:02:42.157 UTC [20516] STATEMENT: INSERT INTO "public"."oban_peers" AS o0 ("name","node","started_at","expires_at") VALUES ($1,$2,$3,$4) ON CONFLICT ("name") DO UPDATE SET "expires_at" = $5

As the error suggests, there is no unique constraint on the oban_peers table; none is specified in the lib/oban/migrations/postgres/v11.ex migration.

Expected Behavior

This INSERT statement should not raise an error. Either the ON CONFLICT clause should be removed (or more likely) a unique constraint should be specified matching that clause.

Odd, I've never seen that Postgres error logged before. The primary key for oban_peers is the name column, which is in effect a unique constraint. When I run the following query multiple times:

insert into oban_peers (name, node, started_at, expires_at) values ('Oban', 'worker-1@127.0.0.1', now(), now()) on conflict (name) do update set updated_at = now();

The operation succeeds, and nothing is logged. Can you verify the structure of your oban_peers table?

This is our oban_peers table:

# \d oban_peers
                    Unlogged table "public.oban_peers"
   Column   |            Type             | Collation | Nullable | Default
------------+-----------------------------+-----------+----------+---------
 name       | text                        |           | not null |
 node       | text                        |           | not null |
 started_at | timestamp without time zone |           | not null |
 expires_at | timestamp without time zone |           | not null |

Notably, we have no primary key. I'm not sure how that can have happened.. but it explains the behaviour. Our app is always clustered and so Oban.Peers.Global works for us. so I think we can close this issue. Sorry for the false alarm!

rossvz commented

We just hooked up some DB tracing today and started noticing a flood of these coming in. Our oban_peers table does appear to have "duplicates"

 name |          node          |         started_at         |         expires_at
------+------------------------+----------------------------+----------------------------
 Oban | app@XXX.XX.XX.251 | 2023-09-21 18:44:27.262174 | 2023-09-21 18:44:57.262174
 Oban | app@XXX.XX.XX.231 | 2023-09-21 18:44:31.128227 | 2023-09-21 18:45:01.128227
 Oban | app@XXX.XX.XX.231 | 2023-09-21 18:44:45.639172 | 2023-09-21 18:45:15.639172
 Oban | app@XXX.XX.XX.251 | 2023-09-21 18:44:46.952235 | 2023-09-21 18:45:16.952235

@simoncocking Did you do anything specifically to resolve these?

Specifically? We moved to Oban.Peers.Global 😅 since we're distributed, rather than trying to resolve this particular issue.

@rossvz Make sure that the oban_peers table has name set as the PRIMARY KEY:

Indexes:
    "oban_peers_pkey" PRIMARY KEY, btree (name)
rossvz commented

@rossvz Make sure that the oban_peers table has name set as the PRIMARY KEY:

Indexes:
    "oban_peers_pkey" PRIMARY KEY, btree (name)

Huh, it looks like locally I have that index but not on my staging servers where I was getting the errors. I do think we probably should be using Oban.Peers.Global since we are running those servers clustered/distributed but that also explains the errors we're seeing now