paunin/PostDock

Postgres cannot start after restart!

ttnghia195 opened this issue · 1 comments

Repmgr tries to connect to Postgres, but Postgres has not finished starting yet.
This is the log from the master node.

>>> Setting up STOP handlers...
>>> STARTING SSH (if required)...
>>> SSH is not enabled!
>>> STARTING POSTGRES...
>>> SETTING UP POLYMORPHIC VARIABLES (repmgr=3+postgres=9 | repmgr=4, postgres=10)... 
>>> TUNING UP POSTGRES...
>>> Configuring /var/lib/postgresql/data/postgresql.conf
>>>>>> Will add configs to the exists file
>>>>>> Adding config 'shared_preload_libraries'=''repmgr'' 
>>> Check all partner nodes for common upstream node...
>>>>>> Checking NODE=sp-db-node-0.sp-db...
psql: could not translate host name "sp-db-node-0.sp-db" to address: Name or service not known
>>>>>> Skipping: failed to get master from the node!
>>>>>> Checking NODE=sp-db-node-1.sp-db...
psql: could not translate host name "sp-db-node-1.sp-db" to address: Name or service not known
>>>>>> Skipping: failed to get master from the node!
>>>>>> Checking NODE=sp-db-node-2.sp-db...
psql: could not translate host name "sp-db-node-2.sp-db" to address: Name or service not known
>>>>>> Skipping: failed to get master from the node!
>>> Auto-detected master name: ''
>>> Setting up repmgr...
>>> Setting up repmgr config file '/etc/repmgr.conf'...
>>> Setting up upstream node...
>>> Sending in background postgres start...
>>> Waiting for local postgres server recovery if any in progress:LAUNCH_RECOVERY_CHECK_INTERVAL=30
>>> Recovery is in progress:
>>>>>> RECOVERY_WAL_ID is empty!
>>> Not in recovery state (anymore)
>>> Waiting for local postgres server start...
>>> Wait schema replica_db.public on sp-db-node-0.sp-db:5432(user: replica_user,password: *******), will try 9 times with delay 10 seconds (TIMEOUT=90)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 9 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 8 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 7 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 6 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 5 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 4 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 3 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 2 times more)
psql: could not connect to server: Connection refused
	Is the server running on host "sp-db-node-0.sp-db" (192.168.247.234) and accepting
	TCP/IP connections on port 5432?
>>>>>> Host sp-db-node-0.sp-db:5432 is not accessible (will try 1 times more)
>>> Schema replica_db.public is not accessible, even after 9 tries!

Maybe the time the local Postgres needs to start depends on the amount of data.

Increasing REPMGR_WAIT_POSTGRES_START_TIMEOUT makes it work, but that may not be the best solution.
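
To illustrate why a larger timeout helps, here is a minimal sketch of the kind of retry loop the log suggests ("will try 9 times with delay 10 seconds (TIMEOUT=90)"). It is not PostDock's actual code: the function name `wait_for_port` and its parameters are hypothetical, and it only checks that the TCP port accepts connections, not that the schema is reachable.

```python
import socket
import time


def wait_for_port(host, port, timeout=90.0, delay=10.0):
    """Poll host:port until a TCP connection succeeds or the tries run out.

    Hypothetical helper: the number of tries is derived as timeout // delay,
    mirroring the "9 times with delay 10 seconds (TIMEOUT=90)" line in the log.
    """
    tries = max(1, int(timeout // delay))
    for attempt in range(1, tries + 1):
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=delay):
                return True
        except OSError:
            remaining = tries - attempt
            print(f"{host}:{port} is not accessible ({remaining} tries left)")
            if remaining:
                time.sleep(delay)
    return False
```

With the values from the log (`timeout=90`, `delay=10`) this gives 9 tries. Raising the timeout to, say, 300 would give 30 tries, which is presumably why a larger REPMGR_WAIT_POSTGRES_START_TIMEOUT lets a slow-starting Postgres come up in time. A fixed timeout still has to be guessed in advance, which may be why it does not feel like the best solution.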