toluaina/pgsync

Multiple Bulk Updates Being Made After 1 Record Change


I am noticing PGSync indexing multiple times when a single record changes.

Example

Given the following schema.json file:

[
  {
    "database": "default",
    "index": "my_index",
    "nodes": {
      "table": "my_table",
      "columns": [
        "foo"
      ]
    }
  }
]

I execute the following query:

UPDATE my_table
SET foo = 'hello'
WHERE id = 1;

What is happening?

Within the following snippet, 1 bulk request happens:

pgsync/pgsync/sync.py

Lines 1228 to 1231 in 1e4c3c7

# forward pass sync
self.search_client.bulk(
    self.index, self.sync(txmin=txmin, txmax=txmax)
)

Then, within this next snippet, 1 more bulk request happens:

pgsync/pgsync/sync.py

Lines 1232 to 1233 in 1e4c3c7

# now sync up to txmax to capture everything we may have missed
self.logical_slot_changes(txmin=txmin, txmax=txmax, upto_nchanges=None)
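
To make the double indexing concrete, here is a minimal, self-contained sketch of the flow as I understand it. It does not import PGSync: `CountingClient`, `ToySync`, `pending_rows`, and all the method bodies are hypothetical stand-ins I wrote for illustration. It also assumes that the changed row is visible both to the forward pass query and to the replication slot replay, which is what I appear to be seeing.

```python
from typing import Iterable, Iterator, List, Optional


class CountingClient:
    """Hypothetical stand-in for the search client; records every bulk call."""

    def __init__(self) -> None:
        self.bulk_calls: List[List[dict]] = []

    def bulk(self, index: str, docs: Iterable[dict]) -> None:
        # Materialize the generator so each request's payload can be inspected.
        self.bulk_calls.append(list(docs))


class ToySync:
    """Toy model of the two steps quoted above (not the real PGSync class)."""

    def __init__(self, client: CountingClient, index: str, pending_rows: List[dict]) -> None:
        self.search_client = client
        self.index = index
        # Assumption: rows changed between txmin and txmax.
        self.pending_rows = pending_rows

    def sync(self, txmin: Optional[int] = None, txmax: Optional[int] = None) -> Iterator[dict]:
        # Yield one document per changed row.
        for row in self.pending_rows:
            yield {"_id": row["id"], "_source": {"foo": row["foo"]}}

    def logical_slot_changes(
        self,
        txmin: Optional[int] = None,
        txmax: Optional[int] = None,
        upto_nchanges: Optional[int] = None,
    ) -> None:
        # Assumption: the slot still holds the same change, so it is sent again.
        self.search_client.bulk(self.index, self.sync(txmin=txmin, txmax=txmax))

    def pull(self, txmin: Optional[int] = None, txmax: Optional[int] = None) -> None:
        # forward pass sync
        self.search_client.bulk(self.index, self.sync(txmin=txmin, txmax=txmax))
        # now sync up to txmax to capture everything we may have missed
        self.logical_slot_changes(txmin=txmin, txmax=txmax, upto_nchanges=None)


client = CountingClient()
ToySync(client, "my_index", [{"id": 1, "foo": "hello"}]).pull()
print(len(client.bulk_calls))  # 2
```

In this toy model both bulk requests carry the same document for the row with id 1, which matches the duplicate indexing described above.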

What do I expect to happen instead?

I would expect only 1 bulk request for a single record change. Is there a valid reason to index multiple times in this case?