EnterpriseDB/repmgr

After the failed primary rejoins the new primary, the PostgreSQL service is inactive on both servers

Opened this issue · 4 comments

In a two-server setup with a master and a standby, when I stop the PostgreSQL service on the master, the standby becomes the new master. I then rejoin the old master to the new master, and it is added as the new standby. However, when I check the status of the PostgreSQL service on both servers, it is reported as inactive, even though I can still connect with psql on the new master and insert rows into a table, and those rows are replicated to the new standby. Why is this happening? When I try to start the PostgreSQL service explicitly, it throws an error saying that the file “postmaster.pid” is already present in the directory “/var/lib/pgsql/12/data”. When I delete “postmaster.pid” from “/var/lib/pgsql/12/data” on both servers and start the PostgreSQL service on the new master, the service then starts working on the new standby.

After this, if I stop the PostgreSQL service on the new master, the new standby is not automatically promoted to master. Why is this not happening?

Got it fixed using settings from the link:
https://repmgr.org/docs/4.0/configuration-service-commands.html
I am still looking for a way to automate the rejoin, though. Is this possible with repmgr?
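
For anyone landing here with the same symptom: if repmgr is not given service commands, it starts and stops PostgreSQL with pg_ctl directly, so systemd never learns about the running server, which would explain an "inactive" service alongside an existing “postmaster.pid”. The settings from the linked page look roughly like the sketch below. The unit name `postgresql-12` and the use of `sudo` are assumptions based on the “/var/lib/pgsql/12/data” path, not something confirmed in this thread, so adjust them to your installation.

```
# repmgr.conf (sketch; the unit name "postgresql-12" and sudo usage are assumptions)
service_start_command   = 'sudo systemctl start postgresql-12'
service_stop_command    = 'sudo systemctl stop postgresql-12'
service_restart_command = 'sudo systemctl restart postgresql-12'
service_reload_command  = 'sudo systemctl reload postgresql-12'
```

With this approach, the user that runs repmgr (typically postgres) needs to be able to run these commands through sudo without a password prompt.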

Can split brain be avoided by adding a witness (a third server) to a two-server master-standby repmgr setup?

I added a witness (a third server) to the two-server master-standby repmgr setup, but I am still seeing split brain. Any pointers for this situation?