
Streaming replication : cannot promote to master while master is down

apompee opened this issue · 2 comments

What is the bug or the crash?

Streaming replication works fine and data is replicated correctly. However, when I try to promote a node to master while the master is down, the node loops on an error while trying to connect to the master.

pg-master:5432 - no response
[Entrypoint]  Waiting for master to ping...

Promoting a node to master typically happens when the master is down, so this behaviour defeats the purpose of the functionality.
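
To illustrate the behaviour I would expect, here is a minimal sketch of a wait loop that skips the master check when promotion is requested. It is not the image's actual entrypoint code: `wait_for_master` is a hypothetical name, and the accepted spellings of `PROMOTE_MASTER` are an assumption.

```shell
#!/bin/sh
# Hypothetical guard, NOT the real entrypoint: wait_for_master is an
# illustrative name chosen for this sketch.
wait_for_master() {
    host="$1"
    port="${2:-5432}"
    case "${PROMOTE_MASTER:-False}" in
        True|true|TRUE)
            # A node being promoted must not depend on the old master
            # being reachable, so return without pinging it.
            echo "Promotion requested, not waiting for ${host}:${port}"
            return 0
            ;;
    esac
    # pg_isready exits 0 once the server accepts connections.
    until pg_isready -h "$host" -p "$port"; do
        echo "Waiting for master to ping..."
        sleep 1
    done
}
```

Called as `PROMOTE_MASTER=True wait_for_master pg-master 5432`, the function returns immediately instead of looping.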

Steps to reproduce the issue

  1. Go to replication_examples/streaming_replication
  2. make up
  3. Check if data is replicated
  4. Stop the master and promote the node to master: docker compose down pg-master && PROMOTE_MASTER="True" DESTROY_DATABASE_ON_RESTART="False" docker compose up -d --scale pg-master=0
  5. make node-log



Additional context

No response

Looks like there are a couple of settings that need to be fixed.

  • pg-node needs to be running in order to initiate a promotion.

So instead, exec into the container and run:

    pg_ctl promote -D ${DATADIR}

Then scale down the master and start the pg-node container.

Can you confirm whether this works? I will open a PR later to fix this logic automatically.
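
The manual steps above could be scripted roughly as follows. This is a sketch under assumptions: the service names `pg-node` and `pg-master` come from the example compose file, `env-data.sh` is assumed to be reachable from the directory `docker compose exec` lands in and to define `DATADIR`, and the `run` helper and `DRY_RUN` variable are hypothetical additions for previewing the commands.

```shell
#!/bin/sh
# Sketch only: promote the running replica by hand, then stop the old master.
# Run from the directory that holds the compose file.

# run: print the command when DRY_RUN=1, otherwise execute it.
# (Hypothetical helper, not part of the repository.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

promote_node() {
    node="${1:-pg-node}"
    master="${2:-pg-master}"
    # 1. Promote inside the running replica container.
    #    Single quotes keep ${DATADIR} unexpanded until the container's
    #    shell has sourced env-data.sh (assumed to set DATADIR).
    run docker compose exec -T "$node" bash -c \
        '. env-data.sh && su postgres -c "pg_ctl promote -D ${DATADIR}"'
    # 2. Stop the old master so it cannot come back as a second primary.
    run docker compose stop "$master"
}
```

Running `DRY_RUN=1 promote_node` prints the two commands without touching any container, which is a cheap way to check them before running `promote_node` for real.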

I just ran the test. I scaled down the master before promoting the node.
To promote the node, I had to run:

# . env-data.sh 
# su postgres -c "pg_ctl promote -D ${DATADIR}"
waiting for server to promote.... done
server promoted

Then I could insert into the database without restarting anything:

gis=# INSERT INTO sweets (name ,price) values ('Test', 10);
gis=# select * from sweets; 
 id |    name    | price 
  1 | strawberry |  4.50
  2 | Coffee     |  6.20
  3 | lollipop   |  3.80
  4 | Test       |    10
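
Besides inserting a row, the promotion can be double-checked with `pg_is_in_recovery()`, which returns `f` on a primary and `t` on a standby. The following is a small sketch meant to run inside the pg-node container; the function names are hypothetical, and the `gis` database and `postgres` OS user are taken from the session above.

```shell
#!/bin/sh
# Map the single-character output of SELECT pg_is_in_recovery() to a role.
role_from_flag() {
    case "$1" in
        f) echo primary ;;
        t) echo standby ;;
        *) echo unknown; return 1 ;;
    esac
}

# Ask the local server for its role.
# Assumptions: database "gis" and OS user "postgres", as in the session above.
# -A (unaligned) and -t (tuples only) make psql print just "t" or "f".
current_role() {
    flag=$(su postgres -c "psql -d gis -Atc 'SELECT pg_is_in_recovery();'")
    role_from_flag "$flag"
}
```

After a successful `pg_ctl promote`, `current_role` should report `primary`.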