toluaina/pgsync

Running `bootstrap` triggers a re-index of the entire schema

jvanderen1 opened this issue · 2 comments

I notice that when I re-run bootstrap, the entire schema is re-indexed. This is not ideal for a few reasons:

  • Nothing should change when bootstrap is called multiple times. The appropriate triggers and replication slots should not be re-added.
  • Any time we wish to add/remove a table or create a new index, the bootstrap command re-indexes every schema. A partial solution for us is to break the schema.json into multiple files, which would not be ideal.

First, do you mean that you run bootstrap, and then pgsync restarts the indexing from the beginning?

So the way this works was actually deliberate:

  • bootstrap was intended to be run just once.
  • If you run it multiple times, the assumption is that the schema has changed, e.g. additional columns, relationships, etc.
  • The only way to ensure the bootstrapped state matches the schema is to re-create the views required for pgsync to run correctly.
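
Since re-running bootstrap is treated as a signal that the schema has changed, one way to avoid accidental re-runs is to fingerprint the schema and only bootstrap when the fingerprint differs from the last recorded one. The following is a minimal sketch of that idea, not part of pgsync itself: `schema_fingerprint`, `needs_bootstrap`, `record_bootstrap`, and the state file are hypothetical names, and it assumes the schema is plain JSON data already loaded into Python.

```python
import hashlib
import json
from pathlib import Path


def schema_fingerprint(schema) -> str:
    """Return a SHA-256 hex digest of the schema, independent of key order."""
    # sort_keys gives a canonical form, so reordering keys does not count as a change.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_bootstrap(schema, state_file: Path) -> bool:
    """True if no fingerprint is recorded yet or the schema differs from it."""
    previous = state_file.read_text().strip() if state_file.exists() else None
    return schema_fingerprint(schema) != previous


def record_bootstrap(schema, state_file: Path) -> None:
    """Store the fingerprint after bootstrap has completed successfully."""
    state_file.write_text(schema_fingerprint(schema))
```

A deployment script could call `needs_bootstrap` before invoking bootstrap and `record_bootstrap` afterwards. Note that this only guards against unnecessary re-runs; it does not make a re-run any cheaper when the schema really has changed.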

@toluaina I came up with a solution to only run bootstrap once in our production database by splitting our schema into multiple files and then combining them. This is not exactly ideal: any time we migrate our database or add/modify the Elasticsearch indices, this triggers a re-index across all indices.
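
For anyone trying the same workaround, the combining step might look roughly like the sketch below. It assumes each partial file holds a JSON array of sync definitions with `database` and `index` keys, as in a typical schema.json; the `combine_schemas` helper and the file names in the usage note are hypothetical.

```python
import json
from pathlib import Path


def combine_schemas(paths):
    """Concatenate several partial schema files into one list of definitions."""
    combined = []
    seen = set()
    for path in paths:
        entries = json.loads(Path(path).read_text())
        if not isinstance(entries, list):
            raise ValueError(f"{path}: expected a JSON array of definitions")
        for entry in entries:
            # Assumption: a (database, index) pair should appear only once overall.
            key = (entry.get("database"), entry.get("index"))
            if key in seen:
                raise ValueError(f"{path}: duplicate definition for {key}")
            seen.add(key)
            combined.append(entry)
    return combined
```

Usage would be something like `Path("schema.json").write_text(json.dumps(combine_schemas(["books.json", "authors.json"]), indent=2))`, with the combined file then passed to bootstrap.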

A couple more questions:

  • If our database migrates a table/column that is not observed by PGSync, will we need to re-trigger bootstrap?
  • Is there any plan to add a migration path for adding/modifying the schema? This feels like a huge oversight for anything production-ready.