2ndQuadrant/pglogical

pglogical.wait_slot_confirm_lsn docs are incorrect

jcoleman opened this issue · 1 comments

The docs for pglogical.wait_slot_confirm_lsn say:

Wait until all replication slots on the current node have replayed up to the xlog insert position at time of call on all providers. Returns when all slots' confirmed_flush_lsn passes the pg_current_wal_insert_lsn() at time of call.

but looking at the code that's not what actually happens:

	if (PG_ARGISNULL(1))
	{
		if (XLogRecPtrIsInvalid(XactLastCommitEnd))
			target_lsn = GetXLogInsertRecPtr();
		else
			target_lsn = XactLastCommitEnd;
	}

Contra the docs we use the LSN for the last commit performed on the current backend (if one is available). That means that it's possible for the following scenario to occur:

  1. Worker A: COMMIT (lsn 123); confirmed_flush_lsn = 0
  2. Worker B: COMMIT (lsn 456); confirmed_flush_lsn = 123
  3. Worker A: wait_slot_confirm_lsn(slot, NULL => turned into last commit at 123); confirmed_flush_lsn = 123
  4. Worker A: wait_slot_confirm_lsn returns
  5. COMMIT (lsn 456) replicates

BTW the behavior here diverged from the documented behavior in b72a618 which sadly has no context in the commit message other than the fact that this was backported from pglogical3, so presumably there was an intentional reason to change this in v3, but there's no information as to why that's desirable, and the docs certainly weren't changed to match.