EnterpriseDB/repmgr

repmgr does not reconnect automatically after a disconnect

Hi,

After a simple firewall change, I noticed that the standby node had been disconnected from the primary node:

WARNING: node "geo2" not found in "pg_stat_replication"
 ID    | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                            
-------+------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------------------------------------------
 37916 | geo1 | primary | * running |          | default  | 100      | 3        | host=node-1.db.codeshell.com user=repmgr dbname=repmgr port=30432 connect_timeout=10
 38236 | geo2 | standby |   running | ! geo1   | default  | 100      | 3        | host=node-2.db.codeshell.com user=repmgr dbname=repmgr port=30432 connect_timeout=10


WARNING: following issues were detected
  - node "geo2" (ID: 38236) is not attached to its upstream node "geo1" (ID: 37916)

The workaround I found for this problem was to run the "standby follow" command on the standby node.

I'm looking for a permanent solution: a way to tell repmgr to reattach the standby automatically when this issue is detected.
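
For illustration, the manual workaround could be automated with a small watchdog on the standby. This is only a sketch, not a built-in repmgr feature: the function name `reattach_if_detached` and the `REPMGR` / `REPMGR_CONF` variables are hypothetical, the default paths are the ones from the configuration in this report, and it assumes that `repmgr node check --upstream` exits with 0 only when the standby is attached to its upstream.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: re-run "standby follow" when the upstream check fails.
# REPMGR and REPMGR_CONF are illustrative variables; defaults match this report.
REPMGR="${REPMGR:-/usr/local/bin/repmgr}"
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr/repmgr.conf}"

reattach_if_detached() {
    # Assumption: a zero exit status means the standby is attached to its upstream.
    if "$REPMGR" node check --upstream -f "$REPMGR_CONF" >/dev/null 2>&1; then
        echo "attached"
        return 0
    fi
    # Same command as the manual workaround, run on the standby node.
    echo "detached, running standby follow"
    "$REPMGR" standby follow -f "$REPMGR_CONF"
}
```

Something like this could be called periodically on the standby (for example from cron, as the postgres user) by invoking `reattach_if_detached` at the end of the script. It only automates the workaround and does not explain why the standby failed to reconnect on its own.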

The repmgr.conf for the primary node is the following:

rsync_options='--rsh="ssh -o ConnectTimeout=10 -p 30022"'
ssh_options='-q -o ConnectTimeout=10 -p 30022'
node_id=37916
node_name='geo1'
conninfo='host=node-1.db.codeshell.com user=repmgr dbname=repmgr port=30432 connect_timeout=10'
data_directory='/var/lib/postgresql/data/pgdata'
promote_command='/usr/local/bin/repmgr standby promote -f /etc/repmgr/repmgr.conf'
follow_command='/usr/local/bin/repmgr standby follow -f /etc/repmgr/repmgr.conf --upstream-node-id=%n'
use_replication_slots=true

The repmgr.conf for the standby node is the following:

rsync_options='--rsh="ssh -o ConnectTimeout=10 -p 30022"'
ssh_options='-q -o ConnectTimeout=10 -p 30022'
node_id=37916
node_name='geo2'
conninfo='host=node-1.db.codeshell.com user=repmgr dbname=repmgr port=30432 connect_timeout=10'
data_directory='/var/lib/postgresql/data/pgdata'
promote_command='/usr/local/bin/repmgr standby promote -f /etc/repmgr/repmgr.conf'
follow_command='/usr/local/bin/repmgr standby follow -f /etc/repmgr/repmgr.conf --upstream-node-id=%n'
use_replication_slots=true

Other relevant information:

PostgreSQL version: 11.15
repmgr version: 5.3.1
I installed repmgr from source, using the release tarball https://github.com/EnterpriseDB/repmgr/archive/refs/tags/v5.3.1.tar.gz

Please let me know if you need other information.
Thanks