EnterpriseDB/docs

Regarding "Backup and Recovery: Single-Server Streaming - Recovery"

alcidesalarcon opened this issue · 1 comment

Summary

Hello:

A) MASTER SERVER
#####################################################
a.1) When I configure the following in postgresql.conf on my "Master" server:

archive_mode = on
archive_command = 'barman-wal-archive db-backup test_master %p'

The following error is generated on my master server:
2024-02-13 14:39:15 ERROR: Error executing ssh: [Errno 32] Broken pipe
2024-02-13 14:39:15 Exception ignored in: <function _Stream.__del__ at 0x7fb1badba5e0>
2024-02-13 14:39:15 Traceback (most recent call last):
2024-02-13 14:39:15 File "/usr/lib/python3.9/tarfile.py", line 410, in __del__
2024-02-13 14:39:15 self.close()
2024-02-13 14:39:15 File "/usr/lib/python3.9/tarfile.py", line 460, in close
2024-02-13 14:39:15 self.fileobj.write(self.buf)
2024-02-13 14:39:15 ValueError: write to closed file
2024-02-13 14:39:15 2024-02-13 14:39:15.338 -03 [33] LOG: archive command failed with exit code 2
2024-02-13 14:39:15 2024-02-13 14:39:15.338 -03 [33] DETAIL: The failed archive command was: barman-wal-archive db-backup test_master pg_wal/00000001000000000000000B
2024-02-13 14:39:15 2024-02-13 14:39:15.338 -03 [33] WARNING: archiving write-ahead log file "00000001000000000000000B" failed too many times, will try again later

a.2) Postgres processes running:
root@83536e06d59a:/opt# ps aux | grep postg

postgres 1 0.0 0.7 3307752 97868 ? Ss 17:21 0:00 postgres
postgres 29 0.0 0.2 3307872 31580 ? Ss 17:21 0:00 postgres: checkpointer
postgres 30 0.0 0.2 3307752 26552 ? Ss 17:21 0:00 postgres: background writer
postgres 31 0.0 0.1 3307752 22136 ? Ss 17:21 0:00 postgres: walwriter
postgres 32 0.0 0.0 3308288 8764 ? Ss 17:21 0:00 postgres: autovacuum launcher
postgres 33 0.0 0.0 3307752 6600 ? Ss 17:21 0:00 postgres: archiver failed on 00000001000000000000000B
postgres 34 0.0 0.0 68128 4964 ? Ss 17:21 0:00 postgres: stats collector
postgres 35 0.0 0.0 3308180 6836 ? Ss 17:21 0:00 postgres: logical replication launcher
postgres 52 0.0 0.0 3308448 9216 ? Ss 17:21 0:00 postgres: walsender postgres 172.19.0.3(43498) streaming 0/F0001C0
postgres 89 0.0 0.0 3308824 11176 ? Ss 17:24 0:00 postgres: walsender streaming_barman 172.19.0.4(57344) streaming 0/F0001C0
root 331 0.0 0.0 6236 648 pts/0 S+ 17:46 0:00 grep postg

B) BACKUP-BARMAN SERVER
#####################################################
b.1) However, when I query my backup-barman server, everything is "ok":

root@f19c0cfde36b:/# barman replication-status test_master

Status of streaming clients for server 'test_master':
Current LSN on master: 0/F0001C0
Number of streaming clients: 2

  1. Async standby
    Application name: walreceiver
    Sync stage : 5/5 Hot standby (max)
    Communication : TCP/IP
    IP Address : 172.19.0.3 / Port: 43498 / Host: -
    User name : postgres
    Current state : streaming (async)
    Replication slot: slot_1
    WAL sender PID : 52
    Started at : 2024-02-13 14:21:40.679493-03:00
    Sent LSN : 0/F0001C0 (diff: 0 B)
    Write LSN : 0/F0001C0 (diff: 0 B)
    Flush LSN : 0/F0001C0 (diff: 0 B)
    Replay LSN : 0/F0001C0 (diff: 0 B)

  2. Async WAL streamer
    Application name: barman_receive_wal
    Sync stage : 3/3 Remote write
    Communication : TCP/IP
    IP Address : 172.19.0.4 / Port: 57344 / Host: -
    User name : streaming_barman
    Current state : streaming (async)
    Replication slot: barman
    WAL sender PID : 89
    Started at : 2024-02-13 14:24:33.860917-03:00
    Sent LSN : 0/F0001C0 (diff: 0 B)
    Write LSN : 0/F0001C0 (diff: 0 B)
    Flush LSN : 0/F000000 (diff: -448 B)

b.2) Checking the server configured in barman, everything is "ok":

root@f19c0cfde36b:/# barman check test_master

Server test_master:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
backup minimum size: OK (0 B)
wal maximum age: OK (no last_wal_maximum_age provided)
wal size: OK (0 B)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 0 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
systemid coherence: OK (no system Id stored on disk)
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: OK
archiver errors: OK

Where did you see the problem?

https://github.com/EnterpriseDB/docs/blob/main/advocacy_docs/supported-open-source/barman/single-server-streaming/step04-restore.mdx

Expected behavior

No response

Screenshots

No response

Browser / Platform

Dockerfile
FROM postgres:14-bullseye

Additional notes

No response

@alcidesalarcon I wouldn't worry too much about that error initially; if you're setting up streaming replication, it isn't strictly necessary for archive_command to work (or even to be set).

However, if it's failing because you don't have SSH configured to allow the two machines to talk to each other, you're going to want to fix that; if nothing else, restoring a backup will tend to work a lot better.
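One way to check the SSH side is a small helper like the sketch below, run as the postgres user on the master. It is only a sketch under assumptions: the function name check_barman_ssh is hypothetical, db-backup is the host name taken from the archive_command above, and barman is assumed to be the remote user (it is the default user barman-wal-archive connects as).

```shell
# Hypothetical helper: verify that the current user can open a
# non-interactive SSH session to the Barman host.
check_barman_ssh() {
    barman_host="${1:-db-backup}"   # host from the archive_command above
    barman_user="${2:-barman}"      # assumed remote user

    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which mirrors how the archiver process would invoke it.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 \
        "${barman_user}@${barman_host}" true; then
        echo "ssh ok: ${barman_user}@${barman_host}"
    else
        echo "ssh failed: ${barman_user}@${barman_host}" >&2
        return 1
    fi
}
```

If `check_barman_ssh db-backup barman` succeeds, `barman-wal-archive --test db-backup test_master DUMMY` should then exercise the same path the archive_command uses without shipping a real WAL file (with `--test`, the WAL path argument is ignored).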