eugene-khyst/letsencrypt-docker-compose

certbot container fails with an error

Closed this issue · 11 comments

I tried to execute the script with the option Multiple Docker Compose projects, because on that server I have Keycloak and PostgreSQL running in other containers.

? What's your domain name (e.g. example.com)? xxx.xx.xx
? What's your email for registration and recovery contact? xxx@xxx.xx
? Want to have 'www' subdomain (e.g. www.example.com)? No
? Want to obtain a test certificate from a staging server? Yes
? What is the RSA key size in bits? 4096
? How do you want to configure Nginx? Reverse proxy
? Does the upstream server run as a Docker container on the same host? Yes
? What is the address of the proxied server (e.g. example-backend:8080)? keycloak_demo:8080
? Enable WebSocket proxying? No
? Want to add another domain? No
? What is the DH parameters size in bits? 2048
? Use Gzip? No
? Are the entered data correct? Yes

However the certbot container fails to start.

dependency failed to start: container letsencrypt-docker-compose-certbot-1 is unhealthy

To find out what the problem is, I need the Certbot container logs: docker compose logs certbot

Sure!

letsencrypt-docker-compose-certbot-1  | Obtaining the certificate for domain key.xxx.xx
letsencrypt-docker-compose-certbot-1  | Testing on staging environment enabled
letsencrypt-docker-compose-certbot-1  | Using email xx@xxx.xx
letsencrypt-docker-compose-certbot-1  | RSA key size is 4096
letsencrypt-docker-compose-certbot-1  | Saving debug log to /var/log/letsencrypt/letsencrypt.log
letsencrypt-docker-compose-certbot-1  | Plugins selected: Authenticator webroot, Installer None
letsencrypt-docker-compose-certbot-1  | Account registered.
letsencrypt-docker-compose-certbot-1  | Requesting a certificate for key.xxx.xx
letsencrypt-docker-compose-certbot-1  | Performing the following challenges:
letsencrypt-docker-compose-certbot-1  | http-01 challenge for key.xxx.xx
letsencrypt-docker-compose-certbot-1  | Using the webroot path /var/www/certbot/key.xxx.xx for all unmatched domains.
letsencrypt-docker-compose-certbot-1  | Waiting for verification...
letsencrypt-docker-compose-certbot-1  | Challenge failed for domain key.xxx.xx
letsencrypt-docker-compose-certbot-1  | http-01 challenge for key.xxx.xx
letsencrypt-docker-compose-certbot-1  | 
letsencrypt-docker-compose-certbot-1  | Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
letsencrypt-docker-compose-certbot-1  |   Domain: key.xxx.xx
letsencrypt-docker-compose-certbot-1  |   Type:   connection
letsencrypt-docker-compose-certbot-1  |   Detail: 2x.xx.xx.xx: Fetching http://key.xxx.xx/.well-known/acme-challenge/FTlE3rx78GzcL0klOFXf7dmAnp-mxxx: Timeout during connect (likely firewall problem)
letsencrypt-docker-compose-certbot-1  | 
letsencrypt-docker-compose-certbot-1  | Hint: The Certificate Authority failed to download the temporary challenge files created by Certbot. Ensure that the listed domains serve their content from the provided --webroot-path/-w and that files created there can be downloaded from the internet.
letsencrypt-docker-compose-certbot-1  | 
letsencrypt-docker-compose-certbot-1  | Cleaning up challenges
letsencrypt-docker-compose-certbot-1  | Some challenges have failed.
letsencrypt-docker-compose-certbot-1  | Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

I don't think this is relevant, but I use docker-compose. In cli.sh I simply replaced docker compose with docker-compose.

And yes, I do have a firewall (IPFire). Currently I use the following setting:
A DNS record key.xxx.xx -> <firewall IP> -> <firewall IP>:80 -> <local server IP>:80
The local server is a virtual machine, which is accessible from the firewall's network. This redirection works fine for HTTP so far, i.e. if I open the DNS name I land on the internal server.

Certbot failed to authenticate domain. Here is the error message:

Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
Detail: 2x.xx.xx.xx: Fetching http://key.xxx.xx/.well-known/acme-challenge/FTlE3rx78GzcL0klOFXf7dmAnp-mxxx: Timeout during connect (likely firewall problem)

Double check your DNS configuration.
Make sure that the website is accessible with self-signed certificates. Run the Docker Compose project in "Dry Run" mode (without running Certbot):

./cli.sh up --dry-run

or

DRY_RUN=true docker compose up -d

and then try to open any page of your website.

The browser will complain that the certificate is self-signed; accept it and proceed to make sure that the page can be accessed.

The dry run passes without errors! I can also access the Keycloak landing page (after accepting the exception), which is now served over HTTPS. However, I cannot access the admin area. The browser logs say:

Content-Security-Policy: The page's settings blocked the loading of a resource at http://key.xx.xx/realms/master/protocol/openid-connect/3p-cookies/step1.html

Please verify that the data for the Let's Encrypt ACME challenge can be accessed.

Run the project in the dry run mode (without actually running Certbot):

./cli.sh up --dry-run

Specify your domain:

domain=your.domain.com

Create a file that will emulate ACME challenge:

docker compose exec certbot mkdir -p /var/www/certbot/${domain}/.well-known/acme-challenge/
docker compose exec certbot sh -c "echo $(date) > /var/www/certbot/${domain}/.well-known/acme-challenge/test-challenge.txt"

Verify that the test file was created:

docker compose exec certbot cat /var/www/certbot/${domain}/.well-known/acme-challenge/test-challenge.txt

Now verify that Nginx is serving this test file, which emulates the ACME challenge, and that it can be accessed:

curl http://${domain}/.well-known/acme-challenge/test-challenge.txt

The output of the curl command must be equal to the content of the test-challenge.txt file.
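The manual comparison above can also be scripted. This is only a minimal sketch: it assumes the certbot service name and the webroot layout used in the commands above, and the function name verify_challenge is made up for illustration (it is not part of cli.sh). If you use docker-compose instead of docker compose, adjust the command accordingly.

```shell
# Compare the test file inside the certbot container with what Nginx serves over HTTP.
# verify_challenge is a hypothetical helper, not part of the project.
verify_challenge() {
  domain="$1"
  path=".well-known/acme-challenge/test-challenge.txt"

  # -T disables pseudo-TTY allocation so the captured output has no stray carriage returns.
  expected=$(docker compose exec -T certbot cat "/var/www/certbot/${domain}/${path}") || return 2

  # --max-time keeps a blocked connection from hanging for minutes.
  actual=$(curl --silent --show-error --max-time 10 "http://${domain}/${path}") || return 3

  if [ "$expected" = "$actual" ]; then
    echo "OK: challenge file is served correctly"
  else
    echo "MISMATCH: served content differs from the file" >&2
    return 1
  fi
}

# Usage (after ./cli.sh up --dry-run and creating the test file):
#   verify_challenge your.domain.com
```

A return code of 3 means curl itself failed (for example with a connection timeout), which points at the network path rather than at Nginx.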

The last step failed:

curl: (28) Failed to connect to key.xxx.xx port 80 after 131066 ms: Connection timed out

However, if I enter the same URL in a browser, it correctly returns the content of the text file, also over HTTP:

Mon Jul 17 09:53:09 AM UTC 2023

How can that be?

Here are the firewall (IPFire) logs of the unsuccessful attempt with curl:

Time      Chain      Iface   Proto  Source        Destination   Src port  Dst port    Country  MAC address
13:21:03  DNAT       green0  TCP    10.0.0.5      23.xx.xxx.xx  57440     80(HTTP)             d2:74:7f:6e:37:xx
13:21:03  FORWARDFW  green0  TCP    10.0.0.5      10.0.0.5      57440     80(HTTP)             d2:74:7f:6e:37:xx
13:21:04  FORWARDFW  green0  TCP    10.0.0.5      10.0.0.5      57440     80(HTTP)             d2:74:7f:6e:37:xx
13:21:06  FORWARDFW  green0  TCP    10.0.0.5      10.0.0.5      57440     80(HTTP)             d2:74:7f:6e:37:xx
13:21:10  FORWARDFW  green0  TCP    10.0.0.5      10.0.0.5      57440     80(HTTP)             d2:74:7f:6e:37:xx
13:21:18  FORWARDFW  green0  TCP    10.0.0.5      10.0.0.5      57440     80(HTTP)             d2:74:7f:6e:37:xx

and the logs of a successful call from a browser:

Time      Chain      Iface   Proto  Source        Destination   Src port  Dst port    Country  MAC address
13:59:46  DNAT       red0    TCP    46.xx.xxx.xx  23.xx.xxx.xx  60351     443(HTTPS)  DE       d2:74:7f:6e:37:xx
13:59:46  FORWARDFW  red0    TCP    46.xx.xxx.xx  10.0.0.5      60351     443(HTTPS)  DE       d2:74:7f:6e:37:xx
13:59:47  DNAT       red0    TCP    46.xx.xxx.xx  23.xx.xxx.xx  60352     80(HTTP)    DE       d2:74:7f:6e:37:xx
13:59:47  FORWARDFW  red0    TCP    46.xx.xxx.xx  10.0.0.5      60352     80(HTTP)    DE       d2:74:7f:6e:37:xx

Here the 4 entries are not always identical between subsequent calls. Sometimes it's https, sometimes http, sometimes both.

Check your network settings to see if there is a proxy server configured.

Can you access http://${domain}/.well-known/acme-challenge/test-challenge.txt from a browser on another device, for example a smartphone?

I discovered that I had location/country blocking activated in the firewall. When I switched it off, the http-01 challenge passed without error. So I completed Step 4 of the installation guide and the certificate was issued by STAGING. However, curl --insecure https://domain still fails with a timeout error. Despite that, I switched to the production environment. Now the strange thing is that the certificate is self-signed (issuer Common Name = my domain). And again, curl domain fails with a timeout.

Yes, I tried http://${domain}/.well-known/acme-challenge/test-challenge.txt from different devices and networks and I can access it through the browser.

What also works is curl http://${domain}/.well-known/acme-challenge/test-challenge.txt from another network! Only curl executed on the server itself does not work. I don't know if this is an issue; maybe it is again some firewall setting, or maybe it is because the server has no direct access to the outside, only through the firewall rules. However, curl google.com works...
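One possible explanation for curl timing out only on the server itself (an assumption; nothing in this thread confirms it) is that the firewall does not send requests from the internal network to its own public address back to the server, which is often called NAT loopback or hairpin NAT. The firewall log above, where the request arrives on green0 with source 10.0.0.5, would be consistent with that. To check Nginx from the server without going through the firewall, curl's --resolve option can pin the domain to a local address. A minimal sketch; the function name fetch_locally and the default address 127.0.0.1 are illustrative:

```shell
# Fetch the test file from Nginx directly, bypassing DNS and the firewall.
# fetch_locally is a hypothetical helper; the second argument is the address
# Nginx is reachable on from this server (assumed to be 127.0.0.1 by default).
fetch_locally() {
  domain="$1"
  local_ip="${2:-127.0.0.1}"

  # --resolve makes curl connect to local_ip while still sending the real
  # Host header, so Nginx picks the right server block.
  curl --silent --show-error --max-time 10 \
    --resolve "${domain}:80:${local_ip}" \
    "http://${domain}/.well-known/acme-challenge/test-challenge.txt"
}

# Usage:
#   fetch_locally your.domain.com
#   fetch_locally your.domain.com 10.0.0.5
```

If this returns the file content while the plain curl call times out, Nginx is fine and the problem is in how the firewall handles traffic from the internal network. It does not affect the ACME challenge, because the Let's Encrypt servers connect from outside.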

What remains is that the certificate in production mode is self-signed. Am I missing something?

OK, the problem was that there were some docker compose commands left. I changed them to docker-compose and now the certificate is issued by Let's Encrypt!
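Leftover commands like these are easy to miss when replacing docker compose by hand. One way to avoid the mix-up is to detect the available command once and reuse it everywhere. This is a sketch only, not something cli.sh does; the function name detect_compose is hypothetical:

```shell
# Print whichever Compose command is installed, preferring V2 (docker compose).
# detect_compose is a hypothetical helper, not part of the project.
detect_compose() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    echo "no Docker Compose command found" >&2
    return 1
  fi
}

# Usage: store the result once and call it unquoted so that
# "docker compose" is split into command and subcommand.
#   COMPOSE=$(detect_compose) || exit 1
#   $COMPOSE logs certbot
```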

docker-compose is V1 and not supported anymore. Consider updating to V2 docker compose.
Did you succeed in getting production Let's Encrypt certificates?

The server runs Ubuntu and I chose to install from docker.io, because this reasoning sounds convincing to me. With that, I have to use docker-compose.

Yes, I got a Let's Encrypt certificate. Then I also turned on a Cloudflare proxy and now the issuer of the certificate is sni.cloudflaressl.com. The certificate is verified by Cloudflare. I'm somewhat surprised that it works like that, but maybe I need to dive deeper to understand the process. Anyway, your hints were extremely valuable for finding the problem!