offen/docker-volume-backup

No custom command executed

Closed this issue · 8 comments

Describe the bug
On one of my stacks (Docker Swarm), custom commands are not being executed.
It works perfectly on other stacks, and I cannot figure out what the difference is.

I tried both methods (scaling down the service vs. stopping the container).
Both work without any problem, but no custom command is executed.

docker-compose.yml

version: "3"

services:

  traccar:
    image: traccar/traccar:6
    labels:
      # Allow container stop during backup
      - "docker-volume-backup.stop-during-backup=traccar3"
      - "docker-volume-backup.process-pre=/bin/sh -c 'curl -m 10 --retry 5 https://health.domain.com/ping/6e0e127d-553b-495b-b890-b254b78b662f'"
      - "docker-volume-backup.exec-label=traccar3"

  backup:
    image: offen/docker-volume-backup:v2
    restart: always
    environment:
      BACKUP_FILENAME: backup-%Y-%m-%dT%H-%M-%S.tar.gz
      BACKUP_PRUNING_PREFIX: backup-
      BACKUP_RETENTION_DAYS: "365"
      BACKUP_CRON_EXPRESSION: "*/5 * * * *"
      BACKUP_STOP_DURING_BACKUP_LABEL: "traccar3"
      EXEC_LABEL: "traccar3"
    volumes:
      - data:/backup/my-app-backup/data:ro
      - database:/backup/my-app-backup/database:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /nas-storage/traccar/full:/archive

volumes:
  data:
  database:

When using:

- "docker-volume-backup.archive-pre"
- "docker-volume-backup.copy-post"

only the archive-pre command is run.
There is no way to get copy-post to run.

One more piece of information: this only happens on this stack.
Other stacks are fine.

Do you know how I could debug this issue?
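
For reference, one quick sanity check is to confirm which running containers actually carry the labels docker-volume-backup matches on. This is only a sketch: the label values are taken from the compose file above, and it must be run on a node where docker is available.

```shell
# Label value assumed from the compose file above; adjust as needed.
EXEC_LABEL="docker-volume-backup.exec-label=traccar3"

if command -v docker >/dev/null 2>&1; then
  # Containers whose commands docker-volume-backup will consider executing:
  docker ps --filter "label=$EXEC_LABEL"

  # Containers that carry a copy-post command label at all:
  docker ps --filter "label=docker-volume-backup.copy-post"
else
  echo "docker not found; run this on the swarm manager node"
fi
```

If the second listing comes back empty, the copy-post label never made it onto the task container, which would explain why the command is silently skipped.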

m90 commented

More info, that only happens on this stack.

Are all of these stacks single node swarm clusters or are some (or all of them) composed of multiple nodes?

No way to get the copy-post run.

Does it just not run or is there an error? Did you try setting EXEC_FORWARD_OUTPUT to true?

Are all of these stacks single node swarm clusters or are some (or all of them) composed of multiple nodes?

All these stacks are in a Docker Swarm of multiple nodes,
but each stack is deployed on a single node (the manager).

Does it just not run or is there an error? Did you try setting EXEC_FORWARD_OUTPUT to true?

When using archive-pre it works, and its output appears in the backup container's logs.
When using others like copy-post, nothing appears.

I just tried setting EXEC_FORWARD_OUTPUT; it does not change anything.

It works only with archive-pre.
archive-post, copy-pre and copy-post do not work.

What is weird is that they work on other stacks.

m90 commented

Considering archive-pre is the earliest stage at which you can run commands, does archive creation work successfully for your schedule? In case the backup run fails at archive creation, no commands following that stage will be run. Could you maybe paste the log output of such a run here?

Considering archive-pre is the earliest stage at which you can run commands, does archive creation work successfully for your schedule? In case the backup run fails at archive creation, no commands following that stage will be run. Could you maybe paste the log output of such a run here?

time=2024-09-30T20:06:51.866Z level=INFO msg="Successfully scheduled backup from environment with expression 0 2 * * *"
time=2024-10-01T02:00:00.103Z level=INFO msg="Now running script on schedule 0 2 * * *"
time=2024-10-01T02:00:00.145Z level=INFO msg="Running docker-volume-backup.archive-pre command /bin/sh -c 'wget https://health.domain.com/ping/6e0e127d-553b-495b-b890-b254b78b662f -T 10 -t 5 -O /dev/null' for container traccar3_traccar.1.wc8tgz2mgb830gpj43o862z0z"
Connecting to health.domain.com ({public_ip}:443)
saving to '/dev/null'
null                 100% |********************************|     2  0:00:00 ETA
'/dev/null' saved
time=2024-10-01T02:00:00.500Z level=INFO msg="Stopping 2 out of 110 running container(s) as they were labeled docker-volume-backup.stop-during-backup=traccar3."
time=2024-10-01T02:00:00.500Z level=INFO msg="Scaling down 0 out of 123 active service(s) as they were labeled docker-volume-backup.stop-during-backup=traccar3."
time=2024-10-01T02:00:04.125Z level=INFO msg="Created backup of `/backup` at `/tmp/backup-2024-10-01T02-00-00.tar.gz`."
time=2024-10-01T02:00:04.221Z level=INFO msg="Restarted 2 container(s)."
time=2024-10-01T02:00:04.221Z level=INFO msg="Scaled 0 service(s) back up."
time=2024-10-01T02:00:04.437Z level=INFO msg="Stored copy of backup `/tmp/backup-2024-10-01T02-00-00.tar.gz` in `/archive`." storage=Local
time=2024-10-01T02:00:04.475Z level=INFO msg="None of 37 existing backups were pruned." storage=Local
time=2024-10-01T02:00:04.480Z level=INFO msg="Removed tar file `/tmp/backup-2024-10-01T02-00-00.tar.gz`."

If I switch to (using EXEC_FORWARD_OUTPUT: "true") : docker-volume-backup.copy-post

time=2024-10-01T10:10:49.990Z level=INFO msg="Successfully scheduled backup from environment with expression */5 * * * *"
time=2024-10-01T10:15:00.097Z level=INFO msg="Now running script on schedule */5 * * * *"
time=2024-10-01T10:15:00.161Z level=INFO msg="Stopping 2 out of 110 running container(s) as they were labeled docker-volume-backup.stop-during-backup=traccar3."
time=2024-10-01T10:15:00.161Z level=INFO msg="Scaling down 0 out of 123 active service(s) as they were labeled docker-volume-backup.stop-during-backup=traccar3."
time=2024-10-01T10:15:02.511Z level=INFO msg="Created backup of `/backup` at `/tmp/backup-2024-10-01T10-15-00.tar.gz`."
time=2024-10-01T10:15:02.518Z level=INFO msg="Restarted 2 container(s)."
time=2024-10-01T10:15:02.518Z level=INFO msg="Scaled 0 service(s) back up."
time=2024-10-01T10:15:02.727Z level=INFO msg="Stored copy of backup `/tmp/backup-2024-10-01T10-15-00.tar.gz` in `/archive`." storage=Local
time=2024-10-01T10:15:02.734Z level=INFO msg="None of 38 existing backups were pruned." storage=Local
time=2024-10-01T10:15:02.738Z level=INFO msg="Removed tar file `/tmp/backup-2024-10-01T10-15-00.tar.gz`."
m90 commented

I'm not entirely sure what's causing this and I have a hard time reproducing the issue. Your example in the OP does not work as the traccar image does not have curl installed, but it does work if I just run a random ls or similar.

Would it be possible for you to create a minimal reproducible compose file that demonstrates the issue? Otherwise I'm a bit lost as to how to help.

m90 commented

Closing this as it's inactive. If you have further hints on how to reproduce it, feel free to reopen.

FOUND IT!

So I was using either (service) labels or (service > deploy) labels.
It turns out I needed a combination of both.

In the traccar service, the stop-during-backup label has to go under deploy > labels so that the service is stopped (and not just the container).
But the custom command labels and exec-label need to be container labels (the service's top-level labels key).

Here is a docker-compose example that works perfectly:

version: "3"

services:

  traccar:
    image: traccar/traccar:6
    restart: on-failure
    ports:
      - "5055:5055"
      - "5144:5144"
      - "5144:5144/udp"
      - "5093:5093"
      - "5093:5093/udp"
    volumes:
      - logs:/opt/traccar/logs
      - data:/opt/traccar/data:rw
      - /path/traccar.xml:/opt/traccar/conf/traccar.xml:ro
    networks:
      - default
    labels:
      - "docker-volume-backup.archive-pre=/bin/sh -c 'wget https://healthcheck.domain.com/ping/00000000-0000-0000-0000-000000000000 -T 10 -t 5 -O /dev/null'"
      - "docker-volume-backup.copy-post=/bin/sh -c 'wget https://health.appbox.camacho.pt/ping/00000000-0000-0000-0000-000000000000 -T 10 -t 5 -O /dev/null'"
      - "docker-volume-backup.exec-label=traccar_app"
    deploy:
      labels:
        # Allow container stop during backup
        - "docker-volume-backup.stop-during-backup=true"

  database:
    image: mariadb
    restart: always
    environment:
      MARIADB_ROOT_PASSWORD: mypassword
      MARIADB_DATABASE: traccar
      MARIADB_USER: traccar
      MARIADB_PASSWORD: mypassword
    volumes:
      - database:/var/lib/mysql:Z
    networks:
      - default
    deploy:
      labels:
        # Allow container stop during backup
        - "docker-volume-backup.stop-during-backup=true"

  backup:
    image: offen/docker-volume-backup:v2
    restart: always
    environment:
      BACKUP_FILENAME: backup-%Y-%m-%dT%H-%M-%S.tar.gz
      BACKUP_PRUNING_PREFIX: backup-
      BACKUP_RETENTION_DAYS: "365"
      # Everyday at 02:00
      BACKUP_CRON_EXPRESSION: "0 2 * * *"
      # Only run commands on containers carrying this exec-label
      EXEC_LABEL: "traccar_app"
      EXEC_FORWARD_OUTPUT: "true"
    volumes:
      - data:/backup/my-app-backup/data:ro
      - database:/backup/my-app-backup/database:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /path/data-backups/traccar/full:/archive

volumes:
  data:
  database:
  logs:
  
networks:
  default:
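
To double-check where each label ended up after deploying, you can inspect the service and one of its task containers separately. A sketch, assuming the stack was deployed under a name that yields a service called traccar_traccar (replace with whatever "docker stack services" shows):

```shell
# Hypothetical service name; replace with your actual one.
SERVICE="traccar_traccar"

if command -v docker >/dev/null 2>&1; then
  # Labels declared under deploy > labels land on the *service*:
  docker service inspect "$SERVICE" --format '{{json .Spec.Labels}}'

  # Labels declared under the service's top-level labels key land on the
  # *container*, which is where docker-volume-backup looks up exec commands:
  docker inspect "$(docker ps -q --filter "name=$SERVICE" | head -n1)" \
    --format '{{json .Config.Labels}}'
else
  echo "docker not found; run this on a swarm manager node"
fi
```

Seeing the exec command labels only in the service output (and not in the container output) would reproduce the original symptom.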

Thanks a lot @m90 for your time.