Is it possible to tell pglogical to start at a specific lsn in the wal?
ApproximateIdentity opened this issue · 1 comments
I'll describe a simplified version of the process to be concrete.
- Create a primary server as
initdb -D ./primary
, set it's wal_level to logical and start it up. - Create a bunch of tables, insert data, etc.
- Reserve a replication slot in some way (the main purpose being to make sure the wal is not garbage collected).
- Stop the server and copy the directory like
cp -r ./primary ./secondary
. Start up both servers (on different ports). - Assume that data has kept writing to the primary after the replication slot was reserved so that the primary has data that the secondary does not have, but also assume the wal has this data preserved. Assume that nothing else like schemas have changes.
Given that setup, I would like to get the secondary caught up with the primary. The secondary is a straight copy so by looking at how far it has gotten in the wal, I should be able to see where in the primary's wal to start replicating. Since the first replication slot has kept the wal from being garbage collected, the data really should be there. So then my question becomes, is it possible to start the replication from that location?
I feel like information theoretically this shouldn't be a problem, but obviously postgres doesn't need to fit my idealized mental model. Is something like this possible with pglogical in some way?
Thanks for any help!
My team was attempting to make use of pglogical
to replicate between Aurora RDS clusters on AWS to facilitate an upgrade from pg11 to pg12 with minimal downtime and we've hit this same wall.
We are taking an RDS snapshot (which is a storage snapshot) of the pg11 cluster, restoring it to a new cluster, upgrading the new cluster to pg12, then establishing replication. Replication is working for us from the time the subscription is created forward, but we haven't found any method to specify starting replication at the point in time LSN from when the snapshot was taken, and for our case, allowing pglogical
to handle the data synchronization doesn't make sense from a time perspective.
Would also appreciate any insight here, as NOT being able to specify this really caught me off guard. Thanks!