Can’t deploy 0.18.4, connection errors
Christoph-Wagner opened this issue · 7 comments
Did you check the FAQ & Troubleshooting section for answers to common questions and issues?
Yes
Describe the issue
Deployment fails, I extracted what seemed relevant from the failure log:
First, the UI errors:
lemmy-easy-deploy-lemmy-ui-1 | http://0.0.0.0:1234
lemmy-easy-deploy-lemmy-ui-1 | API error: FetchError: request to http://lemmy:8536/api/v3/site? failed, reason: getaddrinfo ENOTFOUND lemmy
Then what probably blocks deployment, from lemmy-easy-deploy-lemmy-1 (this error repeats 7 times):
thread 'main' panicked at 'Error connecting to postgres://lemmy:password@postgres:5432/lemmy: could not connect to server: Connection refused Is the server running on host "postgres" (172.21.0.3) and accepting TCP/IP connections on port 5432?
Caddy / lemmy-easy-deploy-proxy-1 also throws some connection errors:
dial tcp 172.21.0.4:8536: connect: connection refused
dial tcp: lookup lemmy on 127.0.0.11:53: no such host
dial tcp 172.21.0.4:8536: i/o timeout
Postgres seems to run fine despite not binding to IPv6:
lemmy-easy-deploy-postgres-1 | 2023-08-09 03:46:54.922 GMT [1] LOG: listening on IPv4 address "127.0.0.1", port 5432
lemmy-easy-deploy-postgres-1 | 2023-08-09 03:46:54.922 GMT [1] LOG: could not bind IPv6 address "::1": Address not available
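A listener bound only to 127.0.0.1 accepts connections only from inside the Postgres container itself, so it is worth checking which address the server reports. The following is a minimal sketch, assuming the log format shown above; the function name pg_listens_beyond_loopback is hypothetical and not part of Lemmy Easy Deploy.

```shell
# Hypothetical helper: read PostgreSQL log lines on stdin and succeed (exit 0)
# only if the server reports listening on an IPv4 address other than 127.0.0.1.
pg_listens_beyond_loopback() {
  awk '/listening on IPv4 address "/ && !/address "127\.0\.0\.1"/ { found = 1 }
       END { exit found ? 0 : 1 }'
}

# Possible usage (assumes the Compose service is named "postgres" and that the
# command is run from the directory holding the generated docker-compose.yml):
#   docker compose logs postgres | pg_listens_beyond_loopback \
#     || echo "Postgres is only listening on loopback"
```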
For the sake of completeness, the rather boring ./custom/customPostgresql.conf:
# DB Version: 15
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 4 GB
# CPUs num: 2
# Data Storage: ssd
max_connections = 200
shared_buffers = 1GB
effective_cache_size = 3GB
maintenance_work_mem = 256MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 2621kB
min_wal_size = 1GB
max_wal_size = 4GB
Diagnostic Information
Run ./deploy.sh -d and paste the output below:
==== Docker Information ====
Detected runtime: docker (Docker version 24.0.5, build ced0996)
Detected compose: docker compose (Docker Compose version v2.20.2)
Runtime state: OK
==== System Information ====
OS: Linux
KERNEL: 6.1.0-9-arm64 (aarch64)
HOSTNAME: OK
SHELL: bash
MEMORY:
total used free shared buff/cache available
Mem: 3.7Gi 382Mi 2.3Gi 3.8Mi 1.3Gi 3.4Gi
Swap: 0B 0B 0B
DISTRO:
----------------------------
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_CODENAME=bookworm
----------------------------
==== Lemmy-Easy-Deploy Information ====
Version: 1.3.0
IMAGE CREATED STATUS
Integrity:
0d3e213450ba646ab61881103a7ffcb2283b8152f36fff97ab735a704f069aa7 ./deploy.sh
587ca168ac5a0d1644df650d100711197c66fb6bf854f7cce0e29df35369e9c1 ./templates/Caddy-Dockerfile.template
c1202e70662dd2228da36a35a0f38ec8fc81bec8964d7315d02e8671a58dd7d7 ./templates/Caddyfile.template
2537678c7971df36c1ed95f4228d3cfcb15bb4a28a60d939eaf8dd75b5d64a36 ./templates/cloudflare.snip
c9cb4c5fee12930e17798a02ae1bd12e2dc69e149a394c24511bc9d4e6b776d4 ./templates/compose-email.snip
c494a610bcb4cd1cfc0a4fe4fb0f6d437b2a84a0ad1625daee240e6dd6f1c910 ./templates/compose-email-volumes.snip
f5325a9e26b29da51c6d3295aa278ff08ce71ffd2cd63dc4bebf00e54c468899 ./templates/docker-compose.yml.template
1c202b1b6e87c65b2fcda6035c9fe3f8631d76662907ffd38f24b14686e30647 ./templates/lemmy-email.snip
c834cdce9eaf77f38155b404724fdfe66845575386ee516987452aa715642a6f ./templates/lemmy.hjson.template
Custom Files:
total 4.0K
-rw-r--r-- 1 0 0 406 Jul 4 16:11 customPostgresql.conf
==== Settings ====
CLOUDFLARE: No
CADDY_DISABLE_TLS: false
CADDY_HTTP_PORT: 80
CADDY_HTTPS_PORT: 443
LEMMY_TLS_ENABLED: true
ENABLE_EMAIL: true
SMTP_PORT: 465
ENABLE_POSTFIX: false
POSTGRES_POOL_SIZE: 100
==== Generated Files ====
Deploy Version: 0.18.3;0.18.3
total 19M
drwxr-xr-x 2 0 0 4.0K Jul 6 16:46 caddy
-rw-r--r-- 1 0 0 32 Aug 9 03:46 caddy.env
-rw-r--r-- 1 70 0 406 Aug 9 03:46 customPostgresql.conf
-rw-r--r-- 1 0 0 1.7K Aug 9 03:46 docker-compose.yml
-rw-r--r-- 1 0 0 50 Jul 4 16:13 lemmy.env
-rw-r--r-- 1 0 0 695 Aug 9 03:46 lemmy.hjson
-rw-r--r-- 1 0 0 19M Jul 29 06:18 lemmy_log.out
-rw-r--r-- 1 0 0 49 Jul 4 16:13 pictrs.env
-rw-r--r-- 1 0 0 36 Aug 9 03:46 postfix.env
-rw-r--r-- 1 0 0 51 Jul 4 16:13 postgres.env
-rw-r--r-- 1 0 0 14 Jul 28 14:35 version
Something is very wrong with your internal Docker networking. None of your services can talk to each other.
Docker isn't responding properly to some DNS requests:
dial tcp: lookup lemmy on 127.0.0.11:53: no such host
And in some cases DNS requests resolve properly but no connection is allowed:
dial tcp 172.21.0.4:8536: connect: connection refused
These issues would impact other Docker Compose services on your machine too, not just Lemmy Easy Deploy.
I do not know what to recommend to assist you with this. Things you can try:
- Ensure your system hostname does not contain lemmy (check with the hostname command).
- Ensure your system does not have any iptables or ufw rules that interfere with Docker.
- Try completely rebooting this system and trying again.
If that doesn't work, I'm not sure, sorry :(
Hopefully it's one of those quick fixes!
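The hostname check from the list above can be scripted. This is a minimal sketch, not part of Lemmy Easy Deploy: the function name hostname_conflicts is hypothetical, and the service names in the usage comment are inferred from the container names in the logs.

```shell
# Hypothetical helper: succeed (exit 0) if the host name contains any of the
# given service names as a substring, otherwise fail (exit 1).
hostname_conflicts() {
  local host="$1" svc
  shift
  for svc in "$@"; do
    case "$host" in
      *"$svc"*) return 0 ;;
    esac
  done
  return 1
}

# Possible usage (service names are an assumption based on the logs above):
#   hostname_conflicts "$(hostname)" lemmy lemmy-ui postgres proxy \
#     && echo "hostname overlaps with a service name"
```

For the firewall item, iptables -S (as root) prints the active rules for inspection.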
The interesting part is that the old (0.18.3) deployment was running happily, with no connection errors or anything.
The iptables rules should be the defaults (and, again, they also work for the old version):
root@lemmy-main: iptables -S
-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION-STAGE-1
-N DOCKER-ISOLATION-STAGE-2
-N DOCKER-USER
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
Ensure your system hostname does not contain lemmy.
It does: lemmy-main, but that was never an issue before. I restarted and changed it to a different one (with hostname newname after the restart); same issues.
But I also found more issues: docker (often even after restarting) would not list anything with docker ps, the old version would not run reliably anymore, and logs would not show up.
I restored the server from backup, and found out that iptables used to have a few more lines:
-A FORWARD -o br-3f409d2b7bca -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-3f409d2b7bca -j DOCKER
-A FORWARD -i br-3f409d2b7bca ! -o br-3f409d2b7bca -j ACCEPT
-A FORWARD -i br-3f409d2b7bca -o br-3f409d2b7bca -j ACCEPT
-A DOCKER -d 172.19.0.3/32 ! -i br-3f409d2b7bca -o br-3f409d2b7bca -p tcp -m tcp --dport 443 -j ACCEPT
-A DOCKER -d 172.19.0.3/32 ! -i br-3f409d2b7bca -o br-3f409d2b7bca -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-3f409d2b7bca ! -o br-3f409d2b7bca -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-2 -o br-3f409d2b7bca -j DROP
I also found out that attempting deployment of 0.18.4 again would break things again, in the same way.
I’m now planning to leave 0.18.3 running for a while to see whether these issues appear for anyone else and whether a fix turns up.
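In the rule sets quoted above, the working state includes rules for a br-... bridge interface, while the broken state only mentions docker0. Docker creates one such bridge per user-defined network, which is what Compose projects use by default. A minimal sketch for spotting the difference at a glance, assuming iptables -S output in the format shown; the function name list_bridge_ifaces is hypothetical.

```shell
# Hypothetical helper: read `iptables -S` output on stdin and print each
# Docker user-defined bridge interface (br-<hex id>) that appears, once each.
# Prints nothing if no such interface is mentioned.
list_bridge_ifaces() {
  { grep -oE 'br-[0-9a-f]+' || true; } | sort -u
}

# Possible usage (requires root):
#   iptables -S | list_bridge_ifaces
```

An empty result while the Compose project is supposed to be up would suggest that Docker's per-network rules are missing, as in the first listing.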
I know the issue. It is because the documentation is incorrect, and I had the same error.
This file is invalid:
# DB Version: 15
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 4 GB
# CPUs num: 2
# Data Storage: ssd
max_connections = 200
shared_buffers = 1GB
effective_cache_size = 3GB
maintenance_work_mem = 256MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 2621kB
min_wal_size = 1GB
max_wal_size = 4GB
Please rename this file and redeploy.
If it works, then use a valid config (I will go get one for you).
Correct file based off your settings above:
listen_addresses = '*'
dynamic_shared_memory_type = posix
log_timezone = 'UTC'
datestyle = 'iso, mdy'
timezone = 'UTC'
lc_messages = 'en_US.utf8' # locale for system error message
lc_monetary = 'en_US.utf8' # locale for monetary formatting
lc_numeric = 'en_US.utf8' # locale for number formatting
lc_time = 'en_US.utf8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
max_connections = 200
shared_buffers = 1GB
effective_cache_size = 3GB
maintenance_work_mem = 256MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 2621kB
min_wal_size = 1GB
max_wal_size = 4GB
Thanks, won’t have time to look into this until tomorrow morning, but will report back then.
Thanks, it worked perfectly! So the way I see it, these parts are mandatory?
listen_addresses = '*'
dynamic_shared_memory_type = posix
log_timezone = 'UTC'
datestyle = 'iso, mdy'
timezone = 'UTC'
lc_messages = 'en_US.utf8' # locale for system error message
lc_monetary = 'en_US.utf8' # locale for monetary formatting
lc_numeric = 'en_US.utf8' # locale for number formatting
lc_time = 'en_US.utf8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
Is this an issue with the Lemmy docs or something LED-specific?
That's what was in the Docker image's conf file by default. When you replace it, those settings are missing, so I just put back the default values I copied from the file. I imagine the values can be changed if you have a need, but those were the defaults from before your custom file was added, which wipes all the values that were already in there.
I am guessing the Lemmy docs are wrong, but I can't be 100% sure. This is my first time using Postgres personally.
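Since the custom file replaces the image's default config wholesale, a quick pre-deploy check for the settings listed above may help. This is a minimal sketch, assuming the key = value form used in this thread (PostgreSQL also accepts lines without the equals sign, which this check would miss); the function name missing_pg_settings is hypothetical and not part of LED.

```shell
# Hypothetical helper: print each setting from the list in this thread that
# is NOT set in the given config file, one name per line.
missing_pg_settings() {
  local conf="$1" key
  for key in listen_addresses dynamic_shared_memory_type log_timezone \
             datestyle timezone lc_messages lc_monetary lc_numeric \
             lc_time default_text_search_config; do
    # A setting counts as present only on an uncommented "key = value" line.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
      printf '%s\n' "$key"
    fi
  done
}

# Possible usage:
#   missing_pg_settings ./custom/customPostgresql.conf
```

Of these, listen_addresses is the one that matches the symptoms in this issue: without it, PostgreSQL falls back to listening on localhost only.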