offen/docker-volume-backup

Automated backups not running when using multiple retention schedules

rurickdev opened this issue · 8 comments

What are you trying to do?

I'm trying to run automatic backups on three schedules: daily (3 days retention), weekly (1 month retention) and monthly (3 months retention).

What is your current configuration?

I basically followed the recipes from here https://offen.github.io/docker-volume-backup/how-tos/define-different-retention-schedules.html and here https://offen.github.io/docker-volume-backup/how-tos/run-multiple-schedules.html
with a couple of tweaks

# 01daily.conf
BACKUP_FILENAME="daily-backup-%Y-%m-%dT%H-%M-%S.tar.gz"
BACKUP_CRON_EXPRESSION="0 2 * * *"
BACKUP_PRUNING_PREFIX="daily-backup-"
BACKUP_RETENTION_DAYS="3"


# 02weekly.conf
BACKUP_FILENAME="weekly-backup-%Y-%m-%dT%H-%M-%S.tar.gz"
BACKUP_CRON_EXPRESSION="0 3 * * 1"
BACKUP_PRUNING_PREFIX="weekly-backup-"
BACKUP_RETENTION_DAYS="31"


# 03monthly.conf
BACKUP_FILENAME="monthly-backup-%Y-%m-%dT%H-%M-%S.tar.gz"
BACKUP_CRON_EXPRESSION="0 4 1 * *"
BACKUP_PRUNING_PREFIX="monthly-backup-"
BACKUP_RETENTION_DAYS="93"
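
For readers unfamiliar with these two settings, here is a minimal Python sketch of how a pruning prefix and a retention period can be combined to pick deletable backups. It is an illustration only, not the tool's actual implementation (which is written in Go): the function name `select_prunable`, the sample file names and dates, and the use of a single timestamp per file are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def select_prunable(files, prefix, retention_days, now):
    """Return the names of files that match `prefix` and are older than the retention period.

    `files` is a list of (name, timestamp) tuples. Hypothetical helper for
    illustration; the real tool may determine a backup's age differently.
    """
    cutoff = now - timedelta(days=retention_days)
    return [
        name
        for name, timestamp in files
        if name.startswith(prefix) and timestamp < cutoff
    ]


# Made-up archive contents, not taken from the issue.
archive = [
    ("daily-backup-2024-09-20T02-00-00.tar.gz", datetime(2024, 9, 20, 2, 0)),
    ("daily-backup-2024-09-23T02-00-00.tar.gz", datetime(2024, 9, 23, 2, 0)),
    ("weekly-backup-2024-09-16T03-00-00.tar.gz", datetime(2024, 9, 16, 3, 0)),
]
now = datetime(2024, 9, 24, 2, 0)

# Only daily backups older than 3 days are candidates; the weekly file is
# ignored here because its name does not start with the daily prefix.
print(select_prunable(archive, "daily-backup-", 3, now))
```

Because each schedule uses its own prefix, the 3-day rule for daily backups never touches the weekly or monthly archives stored in the same directory.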

And my Docker Compose file looks like this

services:
  backup:
    image: offen/docker-volume-backup:v2
    container_name: backup
    volumes:
      - /${XDG_RUNTIME_DIR}/docker.sock:/var/run/docker.sock:ro
      - ${BACKUP_DIR}:/archive
      - ${VOLUME_DIR}/backup/conf.d:/etc/dockervolumebackup/conf.d
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    user: 0:0 # Running rootless; if not set to 0:0, writing files fails due to permissions
    restart: always

I use a compose.override file to add the desired mounts and label the containers to stop, so I don't clutter this compose file and can reuse the setup on other machines.

I confirmed that the backup container and my services are labeled correctly and the drives are mounted, and I also triggered a first backup manually to ensure the config works.

Log output

These are the only logs the container gives me

time=2024-09-21T02:48:12.387-06:00 level=INFO msg="Successfully scheduled backup 01daily.conf with expression 0 2 * * *"
time=2024-09-21T02:48:12.387-06:00 level=INFO msg="Successfully scheduled backup 02weekly.conf with expression 0 3 * * 1"
time=2024-09-21T02:48:12.387-06:00 level=INFO msg="Successfully scheduled backup 03monthly.conf with expression 0 4 1 * *"
time=2024-09-23T09:58:46.182-06:00 level=INFO msg="Successfully scheduled backup 01daily.conf with expression 0 2 * * *"
time=2024-09-23T09:58:46.193-06:00 level=INFO msg="Successfully scheduled backup 02weekly.conf with expression 0 3 * * 1"
time=2024-09-23T09:58:46.209-06:00 level=INFO msg="Successfully scheduled backup 03monthly.conf with expression 0 4 1 * *"

Additional context

Even though the log says the backup configs are scheduled, the backups don't seem to run: there are no new backup files in the /archive directory.

If I run the backups manually following this https://offen.github.io/docker-volume-backup/how-tos/manual-trigger.html for each of my configurations, the backups are created as expected on the mounted /archive volume. That means the configuration works, but for some reason the scheduled job is not triggered.

Solution:

I had mistakenly set the user: 0:0 property in the compose file, and that prevented the cron jobs from triggering. After removing it, the backups ran as scheduled.

m90 commented

Your configuration looks correct, so I am not sure yet what it is that could be causing this.

Does the backup container keep running after it has logged the lines you pasted or does it somehow exit or restart or anything unexpected?

Yeah, the container is always running. I can run manual backups at will, and Portainer always shows the container as healthy.

I don't know if there is a flag I can enable to get more detailed logs, or whether Docker rootless can run cron jobs at all.
Or perhaps the use of user: 0:0 interferes with mounting the timezone and localtime volumes due to permissions. I don't think so, but I'm not an expert.

Update: I removed the user: 0:0 and it still behaves the same. Both with and without the user param, running this command

docker exec backup crontab -l

returns this result

# do daily/weekly/monthly maintenance
# min   hour    day     month   weekday command
*/15    *       *       *       *       run-parts /etc/periodic/15min
0       *       *       *       *       run-parts /etc/periodic/hourly
0       2       *       *       *       run-parts /etc/periodic/daily
0       3       *       *       6       run-parts /etc/periodic/weekly
0       5       1       *       *       run-parts /etc/periodic/monthly

I don't know why there are 15min and hourly jobs that I didn't register.

m90 commented

So the image does not use system crontab (anymore) but the gocron library and is responsible for scheduling jobs itself. I.e. if the process is running, it should schedule commands in a timely manner itself.
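
To make the idea of in-process scheduling concrete, here is a minimal Python sketch that computes when a simple five-field cron expression would next fire. It is not the library's code and deliberately supports only `*` and plain numbers; `next_run` and `field_matches` are hypothetical names. The start time is taken from the second "Successfully scheduled" log entry above.

```python
from datetime import datetime, timedelta


def field_matches(field: str, value: int) -> bool:
    """A field matches if it is a wildcard or equals the value exactly."""
    return field == "*" or int(field) == value


def next_run(expression: str, after: datetime) -> datetime:
    """Return the first minute strictly after `after` that matches `expression`.

    Simplified: every field must match. Real cron implementations treat
    day-of-month and day-of-week specially when both are restricted, which
    none of the three expressions in this issue do.
    """
    minute, hour, dom, month, dow = expression.split()
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(366 * 24 * 60):  # search at most one year ahead
        cron_weekday = candidate.isoweekday() % 7  # cron uses 0 for Sunday
        if (
            field_matches(minute, candidate.minute)
            and field_matches(hour, candidate.hour)
            and field_matches(dom, candidate.day)
            and field_matches(month, candidate.month)
            and field_matches(dow, cron_weekday)
        ):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError("no matching time within one year")


scheduled_at = datetime(2024, 9, 23, 9, 58, 46)  # local time from the log
for expr in ("0 2 * * *", "0 3 * * 1", "0 4 1 * *"):
    print(expr, "->", next_run(expr, scheduled_at))
```

Under these assumptions, a container started on 2024-09-23 would first fire the daily job on 2024-09-24 at 02:00, the weekly job on 2024-09-30 at 03:00 and the monthly job on 2024-10-01 at 04:00, so for the first week only the daily schedule can be expected to produce files.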

Your hunch about the non-root user and the timezone mounts sounds like a starting point to me. What happens if you remove these mounts and use cron expressions in UTC time?
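
For anyone trying this suggestion, here is a small Python sketch of how the three expressions could be rewritten in UTC. It assumes a fixed UTC-6 offset, as shown by the `-06:00` suffix in the log timestamps, and does not account for daylight saving time; `local_cron_to_utc` is a hypothetical helper, not part of the tool.

```python
def local_cron_to_utc(expression: str, utc_offset_hours: int) -> str:
    """Shift the hour field of a five-field cron expression from local time to UTC.

    Only handles a plain numeric hour and refuses shifts that cross midnight,
    because those would also require changing the day fields.
    """
    minute, hour, dom, month, dow = expression.split()
    utc_hour = int(hour) - utc_offset_hours
    if not 0 <= utc_hour <= 23:
        raise ValueError("shift crosses midnight; adjust the day fields by hand")
    return " ".join([minute, str(utc_hour), dom, month, dow])


UTC_OFFSET_HOURS = -6  # assumption based on the log timestamps

SCHEDULES = {
    "01daily.conf": "0 2 * * *",
    "02weekly.conf": "0 3 * * 1",
    "03monthly.conf": "0 4 1 * *",
}

for name, expression in SCHEDULES.items():
    print(name, local_cron_to_utc(expression, UTC_OFFSET_HOURS))
```

With these inputs the hour fields become 8, 9 and 10, and the day fields stay unchanged because none of the shifts crosses midnight.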

I haven't changed the compose file, but today I checked the backup files and a new one had been created automatically (the daily one for today, September 24), so the issue here was probably the use of user: 0:0. I'll recheck tomorrow, and if a new backup has been created (September 25), I'll close this issue and add an update to the main description for future reference.

@m90 I would just like to comment that an option for more verbose logs would be nice. I'm pretty sure the issue was due to permissions, but the logs didn't show anything.

Also, it would be nice if the docs about the custom cron jobs mentioned that this uses the gocron library instead of the system cron. I'm pretty sure I'm not the first, nor will I be the last, to think this uses the system cron jobs.

m90 commented

Errors will always be logged, but for whatever reason your setup seems to fail silently. I still think the combination of rootless and local timezones creates a situation where the job is scheduled correctly, but the time that would trigger it is somehow skipped.

May I ask which timezone your setup is in?

Yeah, it is America/Mexico_City

Also, these are the new log entries I got today

time=2024-09-23T15:07:06.661-06:00 level=INFO msg="Successfully scheduled backup 01daily.conf with expression 0 2 * * *"
time=2024-09-23T15:07:06.661-06:00 level=INFO msg="Successfully scheduled backup 02weekly.conf with expression 0 3 * * 1"
time=2024-09-23T15:07:06.661-06:00 level=INFO msg="Successfully scheduled backup 03monthly.conf with expression 0 4 1 * *"
time=2024-09-24T02:00:00.082-06:00 level=INFO msg="Now running script on schedule 0 2 * * *"
time=2024-09-24T02:00:00.195-06:00 level=INFO msg="Stopping 6 out of 14 running container(s) as they were labeled docker-volume-backup.stop-during-backup=true."
time=2024-09-24T02:00:26.207-06:00 level=INFO msg="Created backup of `/backup` at `/tmp/daily-backup-2024-09-24T02-00-00.tar.gz`."
time=2024-09-24T02:00:31.494-06:00 level=INFO msg="Restarted 6 container(s)."
time=2024-09-24T02:00:31.748-06:00 level=INFO msg="Stored copy of backup `/tmp/daily-backup-2024-09-24T02-00-00.tar.gz` in `/archive`." storage=Local
time=2024-09-24T02:00:35.840-06:00 level=INFO msg="Uploaded a copy of backup `/tmp/daily-backup-2024-09-24T02-00-00.tar.gz` to bucket `s3.darespider.family`." storage=S3
time=2024-09-24T02:00:35.890-06:00 level=INFO msg="None of 2 existing backups were pruned." storage=Local
time=2024-09-24T02:00:35.966-06:00 level=INFO msg="None of 2 existing backups were pruned." storage=S3
time=2024-09-24T02:00:35.991-06:00 level=INFO msg="Removed tar file `/tmp/daily-backup-2024-09-24T02-00-00.tar.gz`."

m90 commented

I'm still a bit unclear about what's going on here, but maybe you can try setting CRON_TZ instead of mounting the directories, as described here robfig/cron#148, and see if that combo works?

The automatic backups are working now; the issue was indeed the user: 0:0.

I looked into why I had added it and remembered that, without it, all my containers from linuxserver.io got permission errors on mounted directories. For some reason I also added it to this compose file (which does not use a linuxserver image), and that prevented the automatic backups from running.

I would guess that the gocron jobs are registered for the root user or similar when using user: 0:0, and the service doesn't trigger the schedules because of that. I don't think the backup process itself fails, because manually triggered backups work as expected with this user set in the compose file; it is just the cron jobs.

Thanks @m90

Also, the timezone volumes worked fine; the backups were created at the correct time based on my timezone.