EnterpriseDB/repmgr

WAL replay is paused on nodes with WAL replay pending

wasiualhasib opened this issue · 0 comments

The issue is that after performing PITR to a specific point in time, WAL replay is paused on all nodes. On the primary, the issue goes away once I run pg_wal_replay_resume().

When I clone a standby from the primary, the clone completes and the standby node starts successfully, but WAL replay is paused with WAL replay pending. I am not sure where the problem is.

If I run pg_wal_replay_resume() on the standby, standby.signal disappears, but a standby node should keep its standby.signal. I then thought I needed to replay WAL on the standby the same way as on the primary, where a barman_wal folder held the WAL files required for replay. So I copied all files and directories from that barman_wal folder to the standby node, created a recovery.signal file there, and restarted the standby. The server starts successfully, but it now shows "node "node2" (ID: 2) is not attached to its upstream node "node1" (ID: 1)".

I am not sure how to perform PITR on a 3-node cluster with one primary and two standbys. For a single node it works well.

```
[postgres@DB3 ~]$ repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------------------------
 1  | node1   | primary | * running |          | default  | 100      | 9        | host=10.16.71.78 user=repmgr dbname=repmgr connect_timeout=2 port=5432
 2  | node2   | standby |   running | node1    | default  | 100      | 9        | host=10.16.71.79 user=repmgr dbname=repmgr connect_timeout=2 port=5432
 3  | node3   | standby |   running | node1    | default  | 100      | 9        | host=10.16.71.80 user=repmgr dbname=repmgr connect_timeout=2 port=5432
 4  | Witness | witness | * running | node1    | default  | 0        | n/a      | host=10.16.71.77 user=repmgr dbname=repmgr connect_timeout=2 port=5432

WARNING: following issues were detected

  • WAL replay is paused on node "node2" (ID: 2) with WAL replay pending; this node cannot be manually promoted until WAL replay is resumed
  • WAL replay is paused on node "node3" (ID: 3) with WAL replay pending; this node cannot be manually promoted until WAL replay is resumed
```

If anyone knows what is causing this, please let me know.
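To keep track of which nodes are affected between test runs, the warning lines can be pulled out of the `repmgr cluster show` output with a small script. This is only a rough sketch: `PAUSED_RE` and `paused_nodes` are names made up for illustration and are not part of repmgr, and the pattern assumes the warning wording shown in the output above (other repmgr versions may word it differently).

```python
import re
from typing import List, Tuple

# Matches the warning wording seen in the output above (assumed, not a repmgr API).
PAUSED_RE = re.compile(
    r'WAL replay is paused on node "(?P<name>[^"]+)" \(ID: (?P<id>\d+)\)'
)


def paused_nodes(cluster_show_output: str) -> List[Tuple[int, str]]:
    """Return (node_id, node_name) for each node reported with paused WAL replay."""
    return [
        (int(m.group("id")), m.group("name"))
        for m in PAUSED_RE.finditer(cluster_show_output)
    ]


if __name__ == "__main__":
    # Two warning lines copied (shortened) from the output above.
    sample = (
        '  • WAL replay is paused on node "node2" (ID: 2) with WAL replay pending\n'
        '  • WAL replay is paused on node "node3" (ID: 3) with WAL replay pending\n'
    )
    print(paused_nodes(sample))  # [(2, 'node2'), (3, 'node3')]
```

In practice the text of `repmgr cluster show` would be passed in, for example after saving it to a file.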

@ibarwick, I need your help with this.