rsync-rotate-backup

A Python script for rotating backups with rsync, similar to Time Machine on macOS.

Primary language: Python. License: MIT.

Requirements

  • rsync
  • a file system that supports hard links, e.g.
    • ext4, ZFS and other file systems used by Unix-like systems are supported
    • NTFS, FAT32 and exFAT are not supported!
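
If you are unsure whether a backup disk meets the hard-link requirement, a quick probe can help. The following is a minimal sketch, not part of rsync-rotate-backup; the function name `supports_hard_links` is hypothetical. It creates a temporary file in the target directory and tries to hard-link it.

```python
import os
import tempfile


def supports_hard_links(directory: str) -> bool:
    """Return True if a hard link can be created inside `directory`."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        original = os.path.join(tmp, "original")
        link = os.path.join(tmp, "link")
        with open(original, "w") as handle:
            handle.write("probe")
        try:
            os.link(original, link)
        except OSError:
            # The file system (or the platform) refused to create the link.
            return False
        # Both names should now refer to the same file.
        return os.path.samefile(original, link)
```

Call it with the mount point of the backup disk, for example `supports_hard_links("/mount/backup")`, before running `init`.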

Parameters

## default rsync-rotate-backup config file

# you must have installed rsync, this is the binary path
RSYNC: /usr/bin/rsync

# the backup source path; it should be an absolute path to a folder and must end with /
# can also be a remote path, like `remote-host.com:/root/`
src: <src> # you should modify this

# when your src is a remote path
#   add -e "ssh -p <port> -i <private_key>" into rsync command
port:
private_key:

# if true, no history snapshots are generated
#   the script just runs rsync with the --delete option
no-history: false # default is false, which keeps many backup folders named like `YYYY-MM-DDThh:mm:ss`

# delete the content in `--backup-dir` path
#   if you want to leave the diff files for view or debug, set it to false
delete-diff: true

# Each time, the script will list the old backups that should be deleted in log file.
#   if this is set to True, really delete these old backups
do-clean: False

## time units for below configs
# s: second
# m: minute
# h: hour
# d: day
# w: week
# M: month
# y: year

autoclean-min-interval: true # autoclean use min-interval filter
interval: 1m                 # min interval of two backups

autoclean-max-age: true # autoclean use max-age filter
max-age: 10y             # max age of a backup

autoclean-multi-level: true # autoclean use multi-level filter
# The multi-level filter
#   we generate time intervals based on the `max-level`, `interval-level` and `start-level` configs
#   e.g., max-hour: 18, interval-hour: 2, start-minute: 0, start-second: 0
# if now is 20:10:10, we will generate these time intervals
#   19:00:00 ~ 21:00:00 # the latest interval, which contains the current time; minute and second come from `start-minute` and `start-second`
#   17:00:00 ~ 19:00:00
#   15:00:00 ~ 17:00:00
#   13:00:00 ~ 15:00:00
#   11:00:00 ~ 13:00:00
#    9:00:00 ~ 11:00:00
#    7:00:00 ~  9:00:00
#    5:00:00 ~  7:00:00
#    3:00:00 ~  5:00:00 # the last interval that does NOT contain 2:10:10 (now - max-hour)
# in each interval, we keep the newest backup (if any) and delete the other backups

## below configs are for multi-level filter
# how long a backup stays at each level; if 0, it stays forever
max-year: 5
max-month: 12
max-week: 6
max-day: 7
max-hour: 24
max-minute: 60
max-second: 60

# the interval in each level. if 0, do not delete old backups at this level. Should be less than max-level.
interval-year: 1    # all years
interval-month: 1   # in same year
interval-week: 1
interval-day: 1     # in same month
interval-hour: 1    # in same day
interval-minute: 10 # in same hour
interval-second: 0  # in same minute

# the start value of each level
start-month:     1 # 1~12
start-day-month: 1 # 1~28
start-day-week:  1 # 1~7
start-hour:      0 # 0~23
start-minute:    0 # 0~59
start-second:    0 # 0~59

See the comments in the default config file above for the meaning of each parameter.
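
To make the time-unit notation concrete, here is a minimal sketch of how values such as `interval: 1m` or `max-age: 10y` could be parsed. It is an illustration, not the script's own parser: the names `parse_duration`, `too_close` and `UNIT_SECONDS` are hypothetical, and the fixed lengths for months (30 days) and years (365 days) are simplifying assumptions. The demo log further down reports the `10y` cutoff as the same calendar date ten years earlier, so the script itself likely uses calendar-aware arithmetic for the longer units.

```python
import re
from datetime import datetime, timedelta

# Seconds per unit. Month and year lengths are assumptions of this sketch.
UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 7 * 86400,
    "M": 30 * 86400,
    "y": 365 * 86400,
}


def parse_duration(text: str) -> timedelta:
    """Parse a value such as '1m' or '10y' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhdwMy])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    count, unit = match.groups()
    return timedelta(seconds=int(count) * UNIT_SECONDS[unit])


def too_close(older: datetime, newer: datetime, interval: timedelta) -> bool:
    """Min-interval check: True if two backups are closer than `interval`."""
    return newer - older < interval
```

Note that the unit letters are case-sensitive: `m` is a minute and `M` is a month.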

Example Usage

$ rsync-rotate-backup init <dest> --src <src> # init the backup dest and set the `src` option in its config
$ vim <dest>/config.yml                       # modify configs if needed
$ rsync-rotate-backup backup-to <dest>        # back up `src` into <dest>
$ rsync-rotate-backup clean <dest>            # manually clean old backups based on the 3 filters
                                              #   or add `--do-clean` to the `backup-to` command above
# add `rsync-rotate-backup backup-to <dest> --do-clean` to `/etc/cron.d/rsync-backup` to run it at a regular interval

Behaviors

  • After you run the backup-to command for the first time, you will get
    • /mount/backup/YYYY-mm-ddTHH:MM:SS
    • /mount/backup/current -> /mount/backup/YYYY-mm-ddTHH:MM:SS
      • these links are hard links
  • After many backup-to commands, you will get
    • /mount/backup/YYYY-mm-ddTHH:MM:SS
    • /mount/backup/YYYY-mm-ddTHH:MM:SS
    • /mount/backup/YYYY-mm-ddTHH:MM:SS
    • /mount/backup/YYYY-mm-ddTHH:MM:SS
    • /mount/backup/current -> /mount/backup/YYYY-mm-ddTHH:MM:SS # to the latest version
  • These backups are incremental backups
  • The full backup history is in dest/log.log
    • it lists every command executed in each backup, so you can see how the backups are implemented
  • A detailed rsync log is in each /mount/backup/YYYY-mm-ddTHH:MM:SS/rsync-rotate-backup.log
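
Because every snapshot folder is named with its timestamp, the backup history can be read back from the directory listing alone. The sketch below assumes the `YYYY-mm-ddTHH:MM:SS` naming shown above; `list_snapshots` and `SNAPSHOT_FORMAT` are hypothetical names, not part of the script.

```python
from datetime import datetime
from pathlib import Path

SNAPSHOT_FORMAT = "%Y-%m-%dT%H:%M:%S"


def list_snapshots(dest: Path) -> list[tuple[datetime, Path]]:
    """Return (timestamp, path) pairs for snapshot folders, oldest first."""
    found = []
    for entry in dest.iterdir():
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, SNAPSHOT_FORMAT)
        except ValueError:
            continue  # skips `current` and any other non-snapshot folder
        found.append((stamp, entry))
    return sorted(found)
```

The last element of the returned list is the newest snapshot, which is the one `current` corresponds to.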

Criteria for cleaning old backups

The script uses 3 criteria to clean old backups

  1. min interval:
    • if two backups are too close together, the older one is deleted
    • default value: interval: 1m
  2. max age:
    • backups that are too old are deleted
    • default value: max-age: 10y
  3. a multi-level criterion:
    • levels are: year, month, week, day, hour, minute, second
    • each level has three related parameters:
      • max-xx: the max time for backups to stay on this level
      • interval-xx: the interval time of this level
      • start-xx: the start time of this level
    • e.g.: max-hour: 18, interval-hour: 2, start-minute: 0, start-second: 0
      • if now is 20:10:10, we will get several intervals (minute and second start at 00:00 of each hour, max-hour is 18 hours and interval-hour is 2 hours):
        • 19:00:00 ~ 21:00:00 # the latest interval, which contains the current time
        • 17:00:00 ~ 19:00:00
        • 15:00:00 ~ 17:00:00
        • 13:00:00 ~ 15:00:00
        • 11:00:00 ~ 13:00:00
        • 09:00:00 ~ 11:00:00
        • 07:00:00 ~ 09:00:00
        • 05:00:00 ~ 07:00:00
        • 03:00:00 ~ 05:00:00 # the last interval that does NOT contain 2:10:10 (now - max-hour)
      • in each interval, we keep only the latest backup and delete the others, if any exist.
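
The multi-level example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script's implementation: `build_intervals` and `select` are hypothetical names, the start of the latest interval is passed in rather than derived from the `start-xx` options, and each interval is treated as excluding its start and including its end, which is what the demo log further down suggests.

```python
from datetime import datetime, timedelta


def build_intervals(latest_start, interval, now, max_span):
    """Walk back from the latest interval until one would contain now - max_span."""
    cutoff = now - max_span
    intervals = []
    start = latest_start
    while start >= cutoff:
        intervals.append((start, start + interval))
        start -= interval
    return intervals


def select(backups, intervals):
    """Split backups into (keep, delete): the newest one per interval is kept."""
    keep, delete = [], []
    for start, end in intervals:
        inside = sorted(b for b in backups if start < b <= end)
        if inside:
            keep.append(inside[-1])
            delete.extend(inside[:-1])
    return keep, delete


now = datetime(2024, 3, 13, 20, 10, 10)
intervals = build_intervals(
    latest_start=datetime(2024, 3, 13, 19, 0, 0),
    interval=timedelta(hours=2),
    now=now,
    max_span=timedelta(hours=18),
)
for start, end in intervals:
    print(f"{start:%H:%M:%S} ~ {end:%H:%M:%S}")
```

With these inputs the loop prints the 2-hour windows from 19:00:00 ~ 21:00:00 down to 03:00:00 ~ 05:00:00. Backups that fall outside every interval are left alone by `select`; in this sketch they are assumed to be handled by the coarser levels or by max age.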

More examples

Example 1: backup /home/my to /mount/backup/my-home

# init the backup container
rsync-rotate-backup init /mount/backup/my-home --src=/home/my
vim backupExclude.conf # add your own exclude patterns
# backupExclude.conf
lost+found
.cache
.gvfs
.mozilla/firefox/*/Cache
.cache/chromium
.thumbnails
.npm

After this, run rsync-rotate-backup backup-to /mount/backup/my-home --do-clean and add it to cron to run every 10 minutes.

After 3 years, we will have these backups

  • 6 backups at the minute level:
    • one every 10 minutes, for the last 60 minutes
  • 24 backups at the hour level:
    • one every hour, for the last 24 hours
  • 7 backups at the day level:
    • one every day, for the last 7 days
  • 6 backups at the week level:
    • one every week, for the last 6 weeks
  • 12 backups at the month level:
    • one every month, for the last 12 months
  • 3 backups at the year level:
    • one every year, for the last 3 years
  • all the history backups are named like YYYY-mm-ddTHH:MM:SS
    • you can treat each backup folder as a snapshot taken at that time
  • thanks to hard links, the backups are incremental, which saves a lot of space
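
The per-level counts follow directly from the default config: each level holds `max-xx / interval-xx` slots, and the year level is additionally capped by how long the backups have been running. A small sketch of that arithmetic (the names `LEVELS` and `expected_slots` are hypothetical):

```python
# (max, interval) per level, copied from the default config.
# The second level is omitted because interval-second is 0 by default.
LEVELS = {
    "minute": (60, 10),
    "hour": (24, 1),
    "day": (7, 1),
    "week": (6, 1),
    "month": (12, 1),
    "year": (5, 1),
}


def expected_slots(levels, years_running):
    """Return the number of interval slots per level after `years_running` years."""
    slots = {name: max_ // step for name, (max_, step) in levels.items()}
    # There cannot be more yearly backups than years of history.
    slots["year"] = min(slots["year"], years_running)
    return slots


print(expected_slots(LEVELS, years_running=3))
```

These numbers are upper bounds per level; a single snapshot can fill a slot at more than one level, so the total number of folders on disk may be smaller than the sum.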

Example 2: backup / to /mount/backup/my-laptop

# init the backup container
rsync-rotate-backup init /mount/backup/my-laptop --src=/
vim backupExclude.conf # add your own exclude patterns
# backupExclude.conf
home/*/.cache
root/.cache
home/*/.gvfs
home/*/.mozilla/firefox/*/Cache
home/*/.cache/chromium
home/*/.thumbnails
home/*/.npm
var/tmp
var/cache
proc
sys
dev
run
tmp
media
mnt
# !! remember to exclude your backup mount path
# mount point of our backup disk (rsync only accepts whole-line comments in this file)
/mount/backup

Example 3: backup several different folders

  • set src to /
# backupExclude.conf
# to include a/b/c, you must add `a/`, `a/b/`, `a/b/c/`, `a/b/c/**`
+ etc/
+ etc/**
+ home/
+ home/someone/
+ home/someone/**
- home/another-one/.cache
+ home/another-one/
+ home/another-one/**
- **
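
Writing the parent-directory rules by hand is error-prone, so they can be generated. The helper below is a hypothetical sketch (`include_rules` is not part of the script) that expands a list of directories, relative to `src`, into the include lines shown above plus the final `- **`.

```python
def include_rules(paths):
    """Build rsync filter lines that include only the given directories."""
    rules = []
    for path in paths:
        parts = path.strip("/").split("/")
        # Every parent directory must be included, or rsync never descends into it.
        for depth in range(1, len(parts) + 1):
            rule = "+ " + "/".join(parts[:depth]) + "/"
            if rule not in rules:
                rules.append(rule)
        # Include everything below the directory itself.
        rules.append("+ " + "/".join(parts) + "/**")
    rules.append("- **")  # exclude everything that was not included
    return rules


print("\n".join(include_rules(["etc", "home/someone"])))
```

rsync uses the first rule that matches, so any specific exclusion, such as `- home/another-one/.cache`, has to be added by hand before the include lines for that directory.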

Example 3 extras: use a safe login shell for root

  • copy utils/backup-shell into /root/.ssh
  • add command="/root/.ssh/backup-shell" before your backup public key in /root/.ssh/authorized_keys, like
# /root/.ssh/authorized_keys
command="/root/.ssh/backup-shell" ssh-rsa AAAAB3NzaC1yc2E....

This login shell restricts your backup server so that it can only run rsync backups and nothing else.

Demo and unittest

Go to tests and run make test, then read /tmp/rsync-rotate-backup/example0/dest/log.log for an overview of how the system works.

Demo log example

The script generates a clean log that shows which rsync commands it calls and which snapshots it will clean. Here is an example log from the unittest run above.

=== 2024-03-13T12:30:41 ===
${HOME}/.local/bin/rsync-rotate-backup backup-to /tmp/rsync-rotate-backup/example0/dest
  start backup
    /tmp/rsync-rotate-backup/example0/src/
    ==>
    /tmp/rsync-rotate-backup/example0/dest/current/
  do rsync link from current to current-working
    $ /usr/bin/rsync -a --link-dest=/tmp/rsync-rotate-backup/example0/dest/current/ /tmp/rsync-rotate-backup/example0/dest/current/ /tmp/rsync-rotate-backup/example0/dest/current-working/
  do rsync into `currentWorking`
    $ /usr/bin/rsync --stats --delete --backup --backup-dir=../diffs/2024-03-13T12:30:41 --exclude-from=/tmp/rsync-rotate-backup/example0/dest/backupExclude.conf -alP /tmp/rsync-rotate-backup/example0/src/ /tmp/rsync-rotate-backup/example0/dest/current-working/
  do rsync link from `current-working` to `datetime` folder
    $ /usr/bin/rsync -a --link-dest=/tmp/rsync-rotate-backup/example0/dest/current-working-done/ /tmp/rsync-rotate-backup/example0/dest/current-working-done/ /tmp/rsync-rotate-backup/example0/dest/2024-03-13T12:30:41
  delete current dir
  mv currentWorking to current
  rsync statistic:
    Number of files: 4 (reg: 3, dir: 1)
    Number of created files: 0
    Number of deleted files: 2 (reg: 2)
    Number of regular files transferred: 1
    Total file size: 176 bytes
    Total transferred file size: 176 bytes
    Literal data: 176 bytes
    Matched data: 0 bytes
    File list size: 0
    File list generation time: 0.001 seconds
    File list transfer time: 0.000 seconds
    Total bytes sent: 367
    Total bytes received: 110
    sent 367 bytes  received 110 bytes  954.00 bytes/sec
    total size is 176  speedup is 0.37
  # extra comments: we turnoff the min-interval filter in the demo
  autoclean find 5/15 files with filters: ['max-age', 'multi-level'] to be cleaned
  auto clean status:
    delete 0 by max-age 10y older than 2014-03-13T12:30:41
    # extra comments: see how we select backups to be reserved and to be deleted
    delete 5 by level-filter
      second     interval:5 max: 60 delete:5 use:10 (13) 2024-03-13T12:29:40~2024-03-13T12:30:45
        [ 1] 2024-03-13T12:30:40 ~ 2024-03-13T12:30:45 deleting:  0 | use: 2024-03-13T12:30:41
        [ 1] 2024-03-13T12:30:35 ~ 2024-03-13T12:30:40 deleting:  0 | use: 2024-03-13T12:30:36
        [ 1] 2024-03-13T12:30:30 ~ 2024-03-13T12:30:35 deleting:  0 | use: 2024-03-13T12:30:32
        [ 2] 2024-03-13T12:30:20 ~ 2024-03-13T12:30:30 deleting:  0 | use: 2024-03-13T12:30:30
        [ 1] 2024-03-13T12:30:15 ~ 2024-03-13T12:30:20 deleting:  0 | use: 2024-03-13T12:30:17
        [ 2] 2024-03-13T12:30:05 ~ 2024-03-13T12:30:15 deleting:  1 | use: 2024-03-13T12:30:15
          /tmp/rsync-rotate-backup/example0/dest/2024-03-13T12:30:12
        [ 1] 2024-03-13T12:30:00 ~ 2024-03-13T12:30:05 deleting:  0 | use: 2024-03-13T12:30:04
        [ 1] 2024-03-13T12:29:55 ~ 2024-03-13T12:30:00 deleting:  1 | use: 2024-03-13T12:29:58
          /tmp/rsync-rotate-backup/example0/dest/2024-03-13T12:29:57
        [ 1] 2024-03-13T12:29:50 ~ 2024-03-13T12:29:55 deleting:  1 | use: 2024-03-13T12:29:55
          /tmp/rsync-rotate-backup/example0/dest/2024-03-13T12:29:52
        [ 2] 2024-03-13T12:29:40 ~ 2024-03-13T12:29:50 deleting:  2 | use: 2024-03-13T12:29:50
          /tmp/rsync-rotate-backup/example0/dest/2024-03-13T12:29:48
          /tmp/rsync-rotate-backup/example0/dest/2024-03-13T12:29:46
      minute     interval:1 max: 60 delete:0 use:1 (61) 2024-03-13T11:29:00~2024-03-13T12:30:00
        [61] 2024-03-13T11:29:00 ~ 2024-03-13T12:30:00 deleting:  0 | use: 2024-03-13T12:29:50
      hour       interval:3 max: 24 delete:0 use:1 (9) 2024-03-12T12:00:00~2024-03-13T15:00:00
        [ 9] 2024-03-12T12:00:00 ~ 2024-03-13T15:00:00 deleting:  0 | use: 2024-03-13T12:29:50
      day        interval:1 max:  7 delete:0 use:1 (8) 2024-03-06T00:00:00~2024-03-14T00:00:00
        [ 8] 2024-03-06T00:00:00 ~ 2024-03-14T00:00:00 deleting:  0 | use: 2024-03-13T12:29:50
      week       interval:1 max:  6 delete:0 use:1 (7) 2024-01-29T00:00:00~2024-03-18T00:00:00
        [ 7] 2024-01-29T00:00:00 ~ 2024-03-18T00:00:00 deleting:  0 | use: 2024-03-13T12:29:50
      month      interval:1 max: 12 delete:0 use:1 (13) 2023-03-01T00:00:00~2024-04-01T00:00:00
        [13] 2023-03-01T00:00:00 ~ 2024-04-01T00:00:00 deleting:  0 | use: 2024-03-13T12:29:50
      year       interval:1 max:  5 delete:0 use:1 (6) 2019-01-01T00:00:00~2025-01-01T00:00:00
        [ 6] 2019-01-01T00:00:00 ~ 2025-01-01T00:00:00 deleting:  0 | use: 2024-03-13T12:29:50
  • the first part of the log shows the steps of our backup
  • the last part shows the auto clean status of our backups
    • we only use the max-age and multi-level filters
      • max-age found nothing to delete (nothing is older than 10 years)
      • we make many backups in the demo, so we set interval-second to 5 and max-second to 60
        • we can see how the time intervals are set at the second level and what we keep and delete in each time interval
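
The max-age line in the log ("older than 2014-03-13T12:30:41") is the backup time moved back ten calendar years. A minimal sketch that reproduces that cutoff follows; `years_before` and `older_than_max_age` are hypothetical names, and the Feb 29 fallback is an assumption of this sketch rather than documented behavior.

```python
from datetime import datetime


def years_before(moment: datetime, years: int) -> datetime:
    """Return the same calendar date and time `years` years earlier."""
    try:
        return moment.replace(year=moment.year - years)
    except ValueError:
        # Feb 29 with a non-leap target year: fall back to Feb 28.
        return moment.replace(year=moment.year - years, day=28)


def older_than_max_age(backups, now, years):
    """Return the backups that are older than the max-age cutoff."""
    cutoff = years_before(now, years)
    return [b for b in backups if b < cutoff]


now = datetime(2024, 3, 13, 12, 30, 41)
print(years_before(now, 10).isoformat())
```

Every snapshot in the demo was taken within the same couple of minutes, so none of them is older than the cutoff, which matches the "delete 0 by max-age" line.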

Install

pip install .

Upgrade notice

I mostly rewrote the program in 2024 (the previous update was before 2020). The previous version is deprecated; you should regenerate the default config.yml and modify it as needed.