/docker-autoheal

Monitor and restart unhealthy docker containers.

Primary LanguageShellMIT LicenseMIT

Docker Autoheal

Monitor and restart unhealthy docker containers. This functionality was proposed to be included with the addition of HEALTHCHECK, however didn't make the cut. This container is a stand-in till there is native support for --exit-on-unhealthy moby/moby#22719.

Supported tags and Dockerfile links

How to use

UNIX socket passthrough

docker run -d \
    --name autoheal \
    --restart=always \
    -e AUTOHEAL_CONTAINER_LABEL=all \
    -v /var/run/docker.sock:/var/run/docker.sock \
    willfarrell/autoheal

TCP socket

docker run -d \
    --name autoheal \
    --restart=always \
    -e AUTOHEAL_CONTAINER_LABEL=all \
    -e DOCKER_SOCK=tcp://HOST:PORT \
    -v /path/to/certs/:/certs/:ro \
    willfarrell/autoheal

a) Apply the label autoheal=true to your container to have it watched.

b) Set ENV AUTOHEAL_CONTAINER_LABEL=all to watch all running containers.

c) Set ENV AUTOHEAL_CONTAINER_LABEL to existing label name that has the value true.

Note: You must apply HEALTHCHECK to your docker images first. See https://docs.docker.com/engine/reference/builder/#healthcheck for details. See https://docs.docker.com/engine/security/https/ for how to configure TCP with mTLS

The certificates, and keys need these names:

  • ca.pem
  • client-cert.pem
  • client-key.pem

Change Timezone

If you need the timezone to match the local machine, you can map the /etc/localtime into the container.

docker run ... -v /etc/localtime:/etc/localtime:ro

ENV Defaults

AUTOHEAL_CONTAINER_LABEL=autoheal
AUTOHEAL_INTERVAL=5   # check every 5 seconds
AUTOHEAL_START_PERIOD=0   # wait 0 seconds before first health check
AUTOHEAL_DEFAULT_STOP_TIMEOUT=10   # Docker waits max 10 seconds (the Docker default) for a container to stop before killing during restarts (container overridable via label, see below)
DOCKER_SOCK=/var/run/docker.sock   # Unix socket for curl requests to Docker API
CURL_TIMEOUT=30     # --max-time seconds for curl requests to Docker API
WEBHOOK_URL=""    # post message to the webhook if a container was restarted (or restart failed)
WEBHOOK_JSON_KEY="text"    # the key to use for the message in the json body of the request to the webhook url

Optional Container Labels

autoheal.stop.timeout=20        # Per containers override for stop timeout seconds during restart

Testing

docker build -t autoheal .

docker run -d \
    -e AUTOHEAL_CONTAINER_LABEL=all \
    -v /var/run/docker.sock:/var/run/docker.sock \
    autoheal