swarm-certbot-traefik

The community edition of Traefik Proxy has no robust Let's Encrypt support for Docker Swarm. If several Traefik instances run with Let's Encrypt enabled, each of them requests the same certificates, they overwrite each other's acme.json storage, and when something goes wrong they exhaust the Let's Encrypt rate limits very quickly.

This project handles certificates with a separate service (called certbot in the provided stack YAML files), which exports a file with certificates in the format expected by Traefik. It discovers domains automatically by searching for certbot.domain labels. The following example shows both Traefik and certbot labels:

    vector:
        image: timberio/vector:0.36.1-alpine
        networks:
            - web
        deploy:
            labels:
                - "traefik.enable=true"
                - "traefik.docker.network=web"
                - "traefik.http.routers.myproject-vector.entrypoints=websecure"
                - "traefik.http.routers.myproject-vector.rule=Host(`sink.example.com`)"
                - "traefik.http.services.myproject-vector.loadbalancer.server.port=8687"
                - "certbot.domain=sink.example.com"
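
To check which domains are announced through certbot.domain labels, the service labels can be inspected from a manager node. This is only a minimal sketch of the idea, not the project's own discovery code: the function names split_domains and list_certbot_domains are hypothetical, and trimming whitespace around entries is a convenience of this sketch.

```shell
# Split a certbot.domain label value into one domain per line,
# trimming surrounding whitespace and dropping empty entries.
split_domains() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Print every domain announced via a certbot.domain label on any swarm
# service. Must run on a manager node, because it queries the service list.
list_certbot_domains() {
    docker service ls --format '{{.Name}}' | while read -r service; do
        label="$(docker service inspect \
            --format '{{ index .Spec.Labels "certbot.domain" }}' "$service")"
        if [ -n "$label" ]; then
            split_domains "$label"
        fi
    done | sort -u
}
```

Calling list_certbot_domains after deploying the example above should list sink.example.com, provided the labels are set under deploy so that they end up on the service rather than on the container.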

Example stacks

Two example docker swarm stacks are provided:

  • manager_single_stack.yml - for a single Traefik instance,
  • manager_multiple_stack.yml - with two main Traefik instances running on the same host (to support rolling updates). This stack adds an edge Traefik instance that acts as a TCP proxy. The setup expects a single public IP without load balancing.

Functionality

  • The Traefik dashboard is protected by basic authentication.
  • A generic redirect from port 80 to 443 is provided (with the exception of ACME challenge requests).
  • Dynamic loading of generated certificates - Traefik requires TLS certificates to be defined in a dynamically loaded file.
  • Challenge requests are routed automatically by Traefik - the server serving the webroot directory is only started when needed.
  • Renewal is performed once a day, when a date change is detected. You can force it from outside using:

    SERVICE_NAME=manager_certbot; docker exec --tty $(docker ps --format json | jq -r 'select(.Names | startswith("'$SERVICE_NAME'")) | .ID') ./renew.sh
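
To illustrate the dynamically loaded certificate file from the list above: Traefik's file provider accepts a tls.certificates list of certFile/keyFile pairs. The sketch below renders such a file for certbot's standard live directory layout. The function name render_tls_config and the assumption of one certificate lineage per argument are illustrative; the export script shipped with this project may produce a different layout.

```shell
# Print a Traefik dynamic configuration (file provider format) that
# references one certbot lineage per argument.
# Assumption: certificates live in certbot's default directory.
render_tls_config() {
    live_dir=/etc/letsencrypt/live
    printf 'tls:\n'
    printf '  certificates:\n'
    for name in "$@"; do
        printf '    - certFile: %s/%s/fullchain.pem\n' "$live_dir" "$name"
        printf '      keyFile: %s/%s/privkey.pem\n' "$live_dir" "$name"
    done
}
```

For example, render_tls_config sink.example.com prints a four-line YAML document that could be written to a file watched by Traefik's file provider.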

Configuration

Use the certbot.domain label; multiple domains can be separated with commas. The project does not try to derive the domains from Traefik labels, because a Traefik rule may match hosts with wildcards while certbot needs to know the exact names:

    labels:
        - "traefik.enable=true"
        - "traefik.docker.network=web"
        - "traefik.http.routers.myproject-nginx.entrypoints=websecure"
        - "traefik.http.routers.myproject-nginx.rule=HostRegexp(`(.*).example.com`)"
        - "traefik.http.services.myproject-nginx.loadbalancer.server.port=80"
        - "certbot.domain=admin.example.com,www.example.com"

When using the provided stacks, please change the following:

  • Replace REPLACE-PASSWORD with your password hash generated by htpasswd -nb.
  • The dashboard itself is covered by a generated certificate - replace traefik.example.com with your dashboard domain.
  • Replace hostmaster@example.com in the environment variable with your email address for Let's Encrypt registration.
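
A note on the htpasswd step: its output contains $ characters, and docker stack deploy interpolates $ in stack files, so a hash written inline in the YAML usually needs every $ doubled. The helper below is a small sketch of that escaping; the name escape_for_stack is hypothetical, and whether the doubling is needed depends on where REPLACE-PASSWORD sits in the stack you use.

```shell
# Double every "$" so that docker stack deploy does not treat parts of
# the hash as variable references.
escape_for_stack() {
    sed -e 's/\$/\$\$/g'
}

# Usage (needs htpasswd, e.g. from apache2-utils or httpd-tools):
#   htpasswd -nb admin 'my-secret' | escape_for_stack
```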

Important

  • When certbot fails to generate a certificate, it stores the log in /etc/letsencrypt/failed/$DOMAIN - you have to delete it to allow another attempt.
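
Retrying a failed domain could look roughly like the following sketch, which reuses the manager_certbot service name and the ./renew.sh script from the forced-renewal command in the Functionality section. The function names are hypothetical, and the sketch assumes the entry under /etc/letsencrypt/failed is a regular file. It has to run on the node where the certbot container is running, since docker ps only lists local containers.

```shell
SERVICE_NAME=manager_certbot   # adjust to the service name in your stack

# Path of the failure log for one domain, as described above.
failed_marker() {
    printf '/etc/letsencrypt/failed/%s\n' "$1"
}

# Show and remove the failure log for one domain, then trigger a renewal.
retry_domain() {
    domain="$1"
    if [ -z "$domain" ]; then
        echo "usage: retry_domain DOMAIN" >&2
        return 1
    fi
    container="$(docker ps --filter "name=${SERVICE_NAME}" --format '{{.ID}}' | head -n 1)"
    if [ -z "$container" ]; then
        echo "no running ${SERVICE_NAME} container on this node" >&2
        return 1
    fi
    docker exec "$container" cat "$(failed_marker "$domain")"    # read the log first
    docker exec "$container" rm -f "$(failed_marker "$domain")"  # allow another attempt
    docker exec "$container" ./renew.sh
}
```

Reading the log before deleting it helps to avoid repeating the same failed request, which would count against the Let's Encrypt rate limits.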

Credits

The code is based on this proof of concept from @bluepuma77.