Switch to acme.sh
buchdag opened this issue ยท 46 comments
As indicated there, a v2.0 version of letsencrypt-nginx-proxy-companion
using acme.sh
instead of simp_le
is being worked on. I'm opening this issue so we can discuss the potential non backward compatible changes introduced by this ACME client swap and how we should handle them.
For now the branch I'm working on is only available on my fork of the project : https://github.com/buchdag/docker-letsencrypt-nginx-proxy-companion/tree/acme.sh
First, here are the issues I encountered.
General use
- The first big difference between the two ACME clients is that while
simp_le
is stateless,acme.sh
is not. It works with a configuration directory and will generate configuration files no matter what. This mean that either a new docker volume will be needed to store those configurations, or that we'll have to put them somewhere into the already existing/etc/nginx/certs
.
Certificates
acme.sh
generate the certificates in its own certificate directory (defaulting to the config directory) with different, non configurable naming conventions thansimp_le
, then optionally copy them to the directory and file name of your choice. It uses its own certificate directory to keep track of what certificate have been generated and if they are still valid or not. If we want to migrate existing certs to acme.sh, we'll have to put them in its cert folder with the same naming convention and re-create the "status file" it uses to keep track of each cert status. But the bigger issue here is that we didn't even have a way to say for sure if a cert inside/etc/nginx/certs
was handled by the companion until very recently, so making such a migration automatic might not even be possible. And we don't even have a way for the container to know its own version yet.acme.sh
makes no distinction between certificates issued from different ACME endpoints. If you ask for a cert from the production endpoint, then for a cert covering the same base domain from the staging endpoint, the first cert will be overwritten. The same goes with V1 <> V2 endpoints.
ACME accounts
acme.sh
andsimp_le
store the ACME account keys in a different way :simp_le
encode the public and the private key in JWK format on a single file and store just those two infos whileacme.sh
encode the private key in PEM format, the public key in JWK format inside a JSON file and additionally store the account id, contact address, creation IP, creation date and account status inside the JSON.- Incidentally, they don't handle contact email the same way.
acme.sh
is using a per-config directory email address which has to be configured prior to certificate issuance whilesimp_le
can take a different address at each invocation. BTW @cpu if you read this, I'm unsure about the way this is handled on the ACME side, are the contact address(es) only stored at the account level and written over if modified, or do you store a potentially different set of contact address(es) for each generated certificate ?
Minor issues
- the
DEBUG
environment variable is used by acme.sh too but it expect a numeric value, so we can't usetrue
anymore. - when issuing a certificate,
acme.sh
prints it in PEM format on std output. It might be a matter of personal taste but I find it to be rather unclean and not fit for a non interactive process. - by default
acme.sh
keep the same private key when renewing a certificate whilesimp_le
has the inverse behaviour.
I'll detail the way I'm currently handling each of these issues later today or in the few next days.
BTW @cpu if you read this, I'm unsure about the way this is handled on the ACME side, are the contact address(es) only stored at the account level and written over if modified, or do you store a potentially different set of contact address(es) for each generated certificate ?
It's the former: contact address(es) are properties of the ACME account resource. There's no way specified in the protocol draft to set a contact address for a specific order, authorization, or issuance. Any updates to the contact information would be handled as an account update.
Hope that helps!
In the current acme.sh
based version I've got (which pass all tests and is currently used on one of my servers), I did the following to address each issue:
-
the image comes preconfigured to use a default configuration directory at
/etc/acme.sh/default
, with/etc/acme.sh
being defined as a volume in the Dockerfile. I personally don't think ACME accounts andacme.sh
config should be stored in/etc/nginx/certs
or be shared with the internet facing proxy container. My thought is we'll probably be forced to break backward compatibility for some stuff anyway, so we should as well correct design issues like this one along the way. -
acme.sh
puts it certificate in/etc/acme.sh/default
then copy them with the same naming convention assimp_le
in the/etc/nginx/certs
folder. Proceeding like this require the least amount of modification on the existing service loop and test suite code. I haven't yet worked on migrating existing cert and I highly doubt that the process could be made automatic because of what I said previously. -
I haven't covered that yet but I think it might be easily addressed by using multiple cert subdirectories in
/etc/acme.sh/default
based on the ACME endpoint used. -
for now my solution is not caring about migrating ACME account keys and just let
acme.sh
create its own account keys. Extracting private key in PEM format from a JWK is doable (did a proof of concept a few days ago) but I really don't think it's worth the effort plus not trying to import account keys from another client is kind of the recommended solution when using certbot anyway, so I don't think this will change. -
the container use as a
DEFAULT_EMAIL
environment variable that set the contact email address for the default ACME account. @cpu answer implies that the way the containers have been doing it for a while does not actually work (= using a single account and being able to specify a per container/certificate email address). I don't remember exactly howsimp_le
handle this internally, but I suspect the email address used is the one supplied when the account creation happens for the first time. -
this is going to be adresses by an upcoming separate PR that rework the logging system, and
DEBUG=true
will automatically be converted to the relevant environment variable and value. -
we could pipe
acme.sh
output togrep
orsed
to try to strip out the certificate from the std out but that's not a super clean solution. Or we could try to contribute a 'non interactive' output mode toacme.sh
. -
the environment variable
REUSE_PRIVATE_KEYS
defaulting tofalse
has been replaced by theRENEW_PRIVATE_KEYS
defaulting totrue
but it's probably better to keep the old environment variable even is theacme.sh
command line flag is actually the inverse.
@cpu if you have 5~10 minutes to spare, would you happen to have an advice on the 4th point above ?
@buchdag I agree its probably not worth the trouble of migrating existing account keys. It might be more important one day when acme-caa is enabled in production and users could have accounturi
restrictions but today that isn't the case (except in staging). The only other case I see where folks have a hard time transitioning to a new account key is if they have coordinated with Let's Encrypt for a rate limit adjustment. Those are often done by ACME account ID as well. Do you have a sense of whether your users might be in this position?
I don't remember exactly how
simp_le
handle this internally, but I suspect the email address used is the one supplied when the account creation happens for the first time.
That is what I wanted to talk about previously.
I closely followed the debug log of simp_le
along with the official ACME protocol and noticed that the request that simp_le
sends for the email is invalid, and the ACME response is silently ignored.
So i.e. the LETSENCRYPT_EMAIL
never actually worked!
(As you said, the very first LETSENCRYPT_EMAIL
mail is used for all certificates on the same endpoint)
The global DEFAULT_EMAIL
would be the only working way.
Generally I agree with all design decisions, but I will investigate closer at a latter time.
The only other case I see where folks have a hard time transitioning to a new account key is if they have coordinated with Let's Encrypt for a rate limit adjustment. Those are often done by ACME account ID as well. Do you have a sense of whether your users might be in this position?
If I remember my digging correctly, simp_le
doesn't store the account ID in any way. So the only way for the user to get the ID with letsencrypt-companion
would be to dig through the debug logs and find the ACME response.
So the average user would be excluded from that.
I'm brand new to this project, but I'd be interested in helping out, if you're still looking for help.
Where did you end up with the migration?
Hi @buchdag,
I've tried the 'acme.sh' branch of your fork with some succes after rebasing it on v1.12.1 plus these changes :
- use acme.sh v2.8.6 (v2.8.0 doesn't work anymore)
- install acme.sh twice:
- once with config-home=/etc/acme.sh/staging
- once with config-home=/etc/acme.sh/default
- update letsencrypt_service script to use the 'staging' config when LETSCENCRYPT_TEST=true
Any chance to have your branch rebased as well so anyone working on it could submit pull requests?
Thanks,
_g.
@pini-gh
you can try my fork here:
https://github.com/Neilpang/nginx-proxy
Also it has a publish docker image: neilpang/nginx-proxy
@Neilpang
It's neat but it doesn't suit my needs. I want to be able to issue certificates for wildcard domains. I'm not there yet, but working on it.
I now have a working configuration with DNS mode enabled. It needs polishing and corner cases testing, but it does work for basic usage.
@pini-gh I have testing your fork in dns_ali mode. My nginx-proxy compainon's configuration is:
OPT=/opt/nginx-proxy
docker run -d --name nginx-proxy \
-p 80:80 \
-p 443:443 \
-v /var/run/docker.sock:/tmp/docker.sock \
-v $OPT/log:/var/log/nginx \
-v $OPT/dhparam:/etc/nginx/dhparam \
-v $OPT/certs:/etc/nginx/certs \
-v $OPT/conf.d:/etc/nginx/conf.d \
-v $OPT/vhost.d:/etc/nginx/vhost.d \
-v $OPT/html:/usr/share/nginx/html \
-e "TZ=Asia/Shanghai" \
-e "VIRTUAL_HOST=foo.example.com" \
-e "VIRTUAL_UPSTREAM=192.168.8.10:8080,192.168.8.10:80 down" \
-e "HTTPS_METHOD=noredirect" \
-e "LETSENCRYPT_HOST=fo.example.com" \
-e "LETSENCRYPT_DNS_MODE=dns_ali" \
-e "LETSENCRYPT_DNS_MODE_SETTINGS=export Ali_Key=xxxxxx; export Ali_Secret=xxxxxxx;" \
--label com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy \
--restart always \
ccpaging/nginx-proxy:alpine
It is working with my nginx-proxy with VIRTUAL_UPSTREAM support.
And I add Dockerfile.alpine
and install_acme_master.sh
in my fork:
https://github.com/ccpaging/docker-letsencrypt-nginx-proxy-companion
Can I make a pull request to your fork?
Can I make a pull request to your fork?
I've had a look at your commit. As I understand it it brings two changes:
- A one-stage only Dockerfile installing docker-gen from its binary release instead of compiling it from source
- Install acme.sh from the head of its master branch
I'm not fond of the former as it brings nothing new to the end user.
For the latter no need to add yet another script. I would consider a patch using a Docker build arg to override the default tree-ish to checkout.
I've had a look at your commit. As I understand it it brings two changes:
- A one-stage only Dockerfile installing docker-gen from its binary release instead of compiling it from source
- Install acme.sh from the head of its master branch
Yes. Accessing github.com may be slower and unstable sometime or somewhere. This companion should have Dockfile.alpine just like nginx-proxy. It is not urgent.
@pini-gh sorry for the lack of answer, been busy on other stuff for a while.
A rebased acmesh
branch will be added to this repo (not my fork) asap.
@ccpaging going back to fetching docker-gen
binaries from GitHub isn't planned as we want to retain multi arch capabilities through muti stage builds.
Installing acme.sh
from the head of the master
branch instead of a release tag isn't planned either, but @pini-gh idea of a Docker build arg is conceivable.
This companion should have Dockfile.alpine just like nginx-proxy. It is not urgent.
The companion is already based on Alpine.
The rebased acme.sh
branch has been pushed to this repo.
The Docker image is available for test : jrcs/letsencrypt-nginx-proxy-companion:acmesh
@pini-gh your design decisions seems pretty sensible to me and I'd like to merge them to the branch but the DNS stuff will have to wait just a little. Would it be ok to you if I committed those changes to the branch with you as the author ?
@buchdag please do, and let me know when it's done so I could rebase my work on your branch.
@buchdag I've just setup branch acme.sh-pini on top of your current acme.sh branch. This should ease cherry-picking from there.
Many thanks !
If possible, could you rework fca180c a bit so that I could directly cherry pick it ?
- on your commit
README.MD
got rolled back to an older, outdated version. - this block is not required on
entrypoint.sh
andletsencrypt_service
as the first action of both is sourcingfunctions.sh
, which starts with thelc()
definition then the same block :
if [[ ${DEBUG:-} == true ]]; then
DEBUG=1 && export DEBUG
fi
accountemail="${!accountemail_varname}"
should beaccountemail="${!accountemail_varname:-"<no value>"}"
to correctly handle environment variable set to an empty value.- rather than
${accountemail:+--accountemail ${accountemail}}
i'd prefer adding the argument to theparams_d_arr
array to stay in line with the existing code. - and last could you use the staging config without email wether staging is going to be used because the
LETSENCRYPT_TEST
variable or the two (global and per proxied container)ACME_CA_URI
variables ? Thats means that$ACME_CA_URI
andLETSENCRYPT_${cid}_ACME_CA_URI
both have to be checked against a regexp in addition to checkingLETSENCRYPT_${cid}_TEST
fortrue
.
Let me know if this is possible for you, and if not: do you mind if I co-author the commit to change this ?
Done. Please tell me in case I missed something.
Hmmm the certs_standalone
test with cherry picked 8b2c6ee is failing : https://travis-ci.com/github/buchdag/docker-letsencrypt-nginx-proxy-companion/builds/189210442
I don't really get why this test specifically and the function that fetch the relevant companion logs in case of CI failures fails for this test too, yay.
There is a typo in line 175 of app/letsencrypt_service: <no value>
should read <no-value>
. Sorry about that.
Update: this is the other way around: actually the typo is line 174 where no-value
should read no value
.
And another typo line 182: acme_ca_uri="ACME_CA_TEST_URI"
should read acme_ca_uri="$ACME_CA_TEST_URI"
. Sorry again.
@pini-gh I'm re-committing your changes in smaller increments and adding tests (+ doc), I'll force push the branch to this repo when things will be clean enough.
6673436 : I don't think --auto-upgrade
, --nocron
and --noprofile
are even evaluated when acme.sh
is called without the --install
flag.
4c9664d : if I understood correctly the http01
challenge was entirely removed by this commit ?
Could you tell me more about eedabe1 ? The .companion
file was precisely added to avoid overwriting certificates that aren't handled by the companion container.
6673436 : I don't think
--auto-upgrade
,--nocron
and--noprofile
are even evaluated whenacme.sh
is called without the--install
flag.
Indeed, you're right.
4c9664d : if I understood correctly the
http01
challenge was entirely removed by this commit ?
Not sure what you mean. If it is about removing the location_configuration
bits, then yes. My opinion is that the LE companion shouldn't need write access to /etc/nginx/vhost.d/
. Instead nginx-proxy can easily set up the challenge blocks where HTTPS is wanted. BTW it does handle a couple of cases already. I just had to fix another couple in my nginx-proxy fork.
Could you tell me more about eedabe1 ? The
.companion
file was precisely added to avoid overwriting certificates that aren't handled by the companion container.
I may have misunderstood the .companion
file's role. The correct fix for the problem I wanted to fix then might be a6c8f2d.
I've pushed and updated acme.sh
branch to this repo, the jrcs/letsencrypt-nginx-proxy-companion:acmesh
image is available on DockerHub too.
Not sure what you mean. If it is about removing the
location_configuration
bits, then yes. My opinion is that the LE companion shouldn't need write access to/etc/nginx/vhost.d/
. Insteadnginx-proxy
can easily set up the challenge blocks where HTTPS is wanted. BTW it does handle a couple of cases already. I just had to fix another couple in mynginx-proxy
fork.
Ok, I mistakenly thought that due to this commit the container wasn't supporting http01
ACME challenge anymore but only dns01
.
I tend to agree with you on the location_configuration
stuff, it is messy and prone to unidentified race conditions, but that was made before any ability to handle ACME challenges was built into upstream nginx.tmpl
(and is still required if we want to support people that use an nginx-proxy
image or nginx.tmpl
predating this change.
Could you tell more about what cases are now correctly handled by the upstream nginx-proxy
and what cases you fixed on your fork (and how) ?
Could you tell more about what cases are now correctly handled by the upstream
nginx-proxy
and what cases you fixed on your fork (and how) ?
Please see https://github.com/pini-gh/nginx-proxy/tree/acme-challenge.
You may be interested in pini-gh/nginx-proxy@8e173b7 as well.
I've pushed and updated
acme.sh
branch to this repo, thejrcs/letsencrypt-nginx-proxy-companion:acmesh
image is available on DockerHub too.
I've now rebased my acme.sh-pini
branch on your acme.sh
.
Could you tell more about what cases are now correctly handled by the upstream
nginx-proxy
and what cases you fixed on your fork (and how) ?Please see https://github.com/pini-gh/nginx-proxy/tree/acme-challenge.
You may be interested in pini-gh/nginx-proxy@8e173b7 as well.
To be honest the docker-gen
templating language in general is a bit difficult to digest, and generating nginx configuration with it even less so, could you walk me through the logic ? What was causing failed authorisations and how did you solve it ?
Well, the main point is this change in pini-gh/nginx-proxy@b73ffd8:
-{{ $is_https := (and (ne $https_method "nohttps") (ne $cert "") (exists (printf "/etc/nginx/certs/%s.crt" $cert)) (exists (printf "/etc/nginx/certs/%s.key" $cert))) }}
+{{ $is_https := (ne $https_method "nohttps") }}
The template should use HTTPS mode wherever the user explicitly stated he wants HTTPS.
Then the acme-challenge snippet is generated for the cases where the certificate doesn't exist yet as well.
This is what can be induced from the code and comments that follow:
{{ if $is_https }}
+{{/* No cert? But we do want HTTPS! Let's fallback on the nginx default one.
+ And set up acme-challenge so that Let's Encrpyt can do its job */}}
+{{ $cert := coalesce $cert "default" }}
+{{/* Very same reasoning as above */}}
+{{ $cert := when (and (exists (printf "/etc/nginx/certs/%s.crt" $cert)) (exists (printf "/etc/nginx/certs/%s.key" $cert))) $cert "default" }}
This part makes also the server use the default certificate when the dedicated one doesn't exist yet.
The next commit, pini-gh/nginx-proxy@f1baeda handles the case where the user set HTTPS_METHOD=noredirect
. This case involves another branch of the template, guarded with this test:
{{ if or (not $is_https) (eq $https_method "noredirect") }}
The other branch which has the acme-challenge block already is guarded with:
{{ if $is_https }}
...
{{ if eq $https_method "redirect" }}
hence doesn't cover the HTTPS_METHOD=noredirect
case.
Finally, commit pini-gh/nginx-proxy@537334f just adds a few comments into the generated nginx config file.
I've now rebased my
acme.sh-pini
branch on youracme.sh
.
Thanks !
3aee0a1 will be worked on a bit later as I intend v2.0.0
to target http-01
challenge only. The removal of the location_configuration
stuff won't be merged as-is, as the objective is still compatibility with upstream nginx-proxy
.
3183301 and c5b5d62 won't be merged until those features make it to upstream nginx-proxy
for the same reason.
51609ac will probably be merged after v2.0.0
is released.
Instead of 1758a52 I was thinking of invoking acme.sh --register-account
inside update_cert()
, any feedback on that ?
Instead of 1758a52 I was thinking of invoking
acme.sh --register-account
insideupdate_cert()
, any feedback on that ?
No strong opinion about that, provided that it does the trick: preventing an override of the configuration with previous settings.
Okay I think I got the acme.sh
branch pretty much where I wanted it for v2.0.0
, bar a few updates to the docs and maybe a few more tests and/or updates to existing tests.
Great! Looking forward to rebase on v2.0.0 :)
I tried digging into code and the related issues, but it was not clear to me:
Is this implementation (v2.0.1) capable of dns01
challenges? Or is it still http01
challenges only? I'd be an awesome feature to have dns challenges implemented, specially for users selfhosting at home, on ISPs that block ports 80 and 443. Using alternative ports such as 8080 or 8443 kills the http01
challenges.
Is this implementation (v2.0.1) capable of
dns01
challenges?
No, it isn't yet. But this is planned, as I undertand it. Meanwhile you may want to try my fork which has dns01
challenge implemented.
Meanwhile you may want to try my fork
It does work indeed! Minor intervention on my ansible roles. Thank you for the pointer, and for the great work!
Thank you for the feedback!
I still get too many certificates error even after I added --volume /etc/acme.sh
string to the jrcs/letsencrypt-nginx-proxy-companion
container.
How can I fix this?
I use exactly the same command as in the README:
docker run --detach \
--name nginx-proxy-letsencrypt \
--volumes-from nginx-proxy \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
--volume /etc/acme.sh \
--env "DEFAULT_EMAIL=mail@yourdomain.tld" \
jrcs/letsencrypt-nginx-proxy-companion
I also tried mount /etc/acme.sh
to the host, it has no effect.
I also tried to use simp_le
version v1.13.1
- this version produces the same errors too.
Once the limit is reached you have to wait a few days. In the meantime use LETSENCRYPT_TEST=true
and check that the resulting certificates are persistent.
Once the limit is reached you have to wait a few days. In the meantime use
LETSENCRYPT_TEST=true
and check that the resulting certificates are persistent.
Yes, I already waited a week and this problem does not disappear. I see in the logs that all certificates reissue each time when I reload the docker project. I also see different certificates in the browser. Finally, I just see those too many certificates
errors.
As I said, set LETSENCRYPT_TEST=true
until you fix the problem.
@CaliforniaMountainSnake --volume /etc/acme.sh
create an anonymous volume that will persist data over containers stop/start but which will be re-created if the container is deleted.
Use a named volume instead:
docker volume create acme
docker run --detach \
--name nginx-proxy-letsencrypt \
--volumes-from nginx-proxy \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
--volume acme:/etc/acme.sh \
--env "DEFAULT_EMAIL=mail@yourdomain.tld" \
jrcs/letsencrypt-nginx-proxy-companion
@buchdag, thank you, with named volumes the certificates are persisted!