promoting standby does not change role
Closed this issue · 7 comments
Hello,
If the primary fails and repmgr promotes the standby, repmgr cluster show still shows the standby's role as "standby", but with the note
! standby running as primary
How do I fix this, and why isn't the role changed to primary?
I have repmgr 4.3 with PostgreSQL 11.
- Which node are you running repmgr cluster show on?
- Is the former primary still running?
You will see this output if a standby was promoted but (for whatever reason) the original primary is still running and you run repmgr cluster show on the original primary, as it's not expecting to see its former standby running as primary.
If this is the case, you may be able to rejoin the former primary as a standby of the new primary by executing repmgr node rejoin, possibly using the --force-rewind option if available and appropriate.
Documentation: repmgr node rejoin
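As a rough sketch of the rejoin step described above (not a definitive procedure), the helper below wraps repmgr node rejoin so it can first be previewed with --dry-run. It is meant to be run on the former primary with its PostgreSQL instance shut down. The function name rejoin_former_primary, the host node2, and the connection settings dbname=repmgr user=repmgr are assumptions for illustration; substitute the values from your own repmgr.conf. Note that --force-rewind relies on pg_rewind, which needs wal_log_hints or data checksums to be enabled.

```shell
#!/bin/sh
# Sketch only: rejoin a former primary as a standby of the new primary.
# REPMGR and CONF can be overridden from the environment.
REPMGR="${REPMGR:-repmgr}"
CONF="${CONF:-/etc/repmgr.conf}"

# Usage: rejoin_former_primary NEW_PRIMARY_HOST [extra repmgr options...]
# The function name and the conninfo values are hypothetical.
rejoin_former_primary() {
    if [ "$#" -lt 1 ]; then
        echo "usage: rejoin_former_primary NEW_PRIMARY_HOST [options]" >&2
        return 2
    fi
    primary_host="$1"
    shift
    # Connection string pointing at the node that is primary now.
    conninfo="host=${primary_host} dbname=repmgr user=repmgr"
    $REPMGR node rejoin -f "$CONF" -d "$conninfo" --force-rewind --verbose "$@"
}
```

Calling rejoin_former_primary node2 --dry-run only reports what would be done; dropping --dry-run performs the rejoin once the preview looks right.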
If I remove the settings for promotion and automatic failover from /etc/repmgr.conf, the standby's status changes to "! running as primary" but its role remains standby.
Then when I try
repmgr standby promote
it says
ERROR!: STANDBY PROMOTE can only be used on a standby node
The configuration for my setup is taken from this
https://medium.com/@victor.boissiere/how-to-setup-postgresql-cluster-with-repmgr-febc2f10c243
What exactly is triggering this status change? If I can keep the setup intact, I will be able to try, line by line on the command line, what repmgr.conf specifies.
I found that repmgr standby promote does change the standby to primary, but as soon as the original primary comes back, the standby's role changes from primary back to standby and its status says it is running as primary. In my setup the primary may crash often but will seldom go down permanently. It may also be rebooted during a software upgrade, which means it is going to come back.
@kamal-prasad In that case, the correct way to bring it back should be repmgr node rejoin -h current_primary, as Ian already stated.
Either that, or pause the repmgr daemon (repmgr daemon pause), which means that if the primary does go down "permanently", you'll need to promote the standby yourself (i.e. it won't happen automatically like it does now).
Whichever you choose depends on what fits your setup best; there's no perfect solution for this at the moment.
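For planned restarts such as a software upgrade, the pause option could be scripted roughly as follows. This is a minimal sketch under assumptions: the function name run_with_failover_paused is made up for illustration, and it simply brackets an arbitrary command with repmgr daemon pause and repmgr daemon unpause so the standby is not promoted while the primary is briefly down.

```shell
#!/bin/sh
# Sketch only: run a maintenance command with automatic failover paused.
# REPMGR and CONF can be overridden from the environment.
REPMGR="${REPMGR:-repmgr}"
CONF="${CONF:-/etc/repmgr.conf}"

# Usage: run_with_failover_paused COMMAND [ARGS...]
# The function name is hypothetical.
run_with_failover_paused() {
    # Stop here if failover could not be paused.
    $REPMGR daemon pause -f "$CONF" || return 1
    "$@"
    status=$?
    # Always try to re-enable automatic failover, even if the command failed.
    $REPMGR daemon unpause -f "$CONF" || echo "warning: unpause failed" >&2
    # Report the exit status of the maintenance command itself.
    return "$status"
}
```

It would be used as run_with_failover_paused followed by whatever restarts the primary on your system. While failover is paused, an unplanned permanent loss of the primary still requires a manual promotion.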
I think that if repmgr on the old primary tried to rejoin the cluster automatically when it sees there's another primary, instead of letting the PostgreSQL instance come up as primary even though there's a new one, that would be the solution and issues like this wouldn't come up.
@ibarwick perhaps consider adding this feature if it isn't already planned?
Or is there some reason I didn't think of why this shouldn't be the case?