rsync is an open source utility that provides fast incremental file transfer.
If you are into Dockerizing everything or you just want to have a better view over the rsync process (e.g., to see it in your cluster visualizer), this repository provides you with a Docker image to run the rsync process.
The provided image is open source and built from scratch, with the goal of letting you run a stateless, immutable container in line with best practices.
With this image you can:
- run a simple one-time rsync job for local drives
- run an rsync job for remote drives using ssh (included in the image)
- run scheduled rsync jobs using cron (included in the image)
Supported architectures:
- the image supports multiple architectures: `x86-64` and `arm32`
- the docker manifest is used for multi-platform awareness
- by simply pulling or running `ogivuk/rsync:latest`, the correct image for your architecture will be retrieved
Tag | Architecture |
---|---|
`:latest` | supports both x64 and arm32v7 architectures |
`:x64` | targeted to the x64 architecture |
`:arm32v7` | targeted to the arm32v7 architecture |
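
If you prefer to pin an architecture-specific tag instead of relying on the manifest, a small sketch like the following could pick the tag from `uname -m`. The mapping of machine names to tags (`x86_64` to `:x64`, `armv7l` to `:arm32v7`) is an assumption about typical Linux hosts, and the variable names `TAG` and `IMAGE` are only illustrative.

```shell
# Map the machine name reported by the kernel to an image tag (assumed mapping).
case "$(uname -m)" in
  x86_64) TAG="x64" ;;
  armv7l) TAG="arm32v7" ;;
  *)      TAG="latest" ;;   # fall back to the multi-arch manifest
esac

IMAGE="ogivuk/rsync:$TAG"
echo "$IMAGE"
```

The resulting name can then be passed to `docker pull "$IMAGE"` or used in place of `ogivuk/rsync` in the commands below.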
The image is based on the `alpine` image and it includes:
- `rsync`
- `openssh-client` for remote sync over ssh
- `tzdata` for easier setting up of local timezone (for file timestamps)
- `cron` (included in the alpine) for scheduling regular back-ups
- `rsync.sh` script that prepares the cron job and starts the cron daemon
docker run --rm \
--name=rsync \
--volume /path/to/source/data:/data/src \
--volume /path/to/destination/data:/data/dst \
ogivuk/rsync [OPTIONS] /data/src/ /data/dst/
Replace:
- `/path/to/source/data` with the source folder to be copied or backed-up
- `/path/to/destination/data` with the destination folder
- `[OPTIONS]` with desired rsync optional arguments
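
As an illustration of filling in these placeholders, the sketch below wraps the one-time job in a small shell function. The function name `rsync_once`, the example folders, and the chosen options (`-av --delete`) are assumptions made for this example; the `docker run` arguments themselves are the ones shown above.

```shell
# rsync_once SRC DST [RSYNC OPTIONS...]
# Runs a one-time rsync job in a throwaway container.
rsync_once() {
  src=$1   # source folder on the host
  dst=$2   # destination folder on the host
  shift 2  # the remaining arguments are passed to rsync as options

  # The trailing slash on /data/src/ makes rsync copy the folder's
  # contents rather than the folder itself.
  docker run --rm \
    --name=rsync \
    --volume "$src":/data/src \
    --volume "$dst":/data/dst \
    ogivuk/rsync "$@" /data/src/ /data/dst/
}

# Example call with hypothetical folders:
#   -a        archive mode (recursive, keeps permissions and timestamps)
#   -v        verbose output
#   --delete  remove files from the destination that are gone from the source
# rsync_once "$HOME/photos" "$HOME/backup/photos" -av --delete
```

Adding `--dry-run` to the options is a cautious way to check what would be transferred or deleted before running the job for real.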
- create a folder, for example `~/rsync`, that will be later mounted in the container
  - this folder is intended to hold supporting files such as log files, crontab file, and any supporting rsync files (e.g., list of files to rsync)
  - the subfolder `logs`, for example `~/rsync/logs`, is suggested for the log files
  - important note: for Docker Swarm, this directory needs to be available on all nodes in Docker swarm, e.g., via network shared storage

  mkdir -p ~/rsync/logs

  - you can replace `~/rsync` with any other desired location
- run as a container:

  docker run --rm \
  --name=rsync \
  --env TZ="Europe/Zurich" \
  --env RSYNC_CRONTAB="crontab" \
  --volume ~/rsync:/rsync \
  --volume /path/to/source/data:/data/src \
  --volume /path/to/destination/data:/data/dst \
  ogivuk/rsync
- run as a swarm service:

  docker service create \
  --name=rsync \
  --env TZ="Europe/Zurich" \
  --env RSYNC_CRONTAB="crontab" \
  --mount type=bind,src=$HOME/rsync,dst=/rsync \
  --mount type=bind,src=/path/to/source/data,dst=/data/src \
  --mount type=bind,src=/path/to/destination/data,dst=/data/dst \
  ogivuk/rsync
Parameter | Explanation | When to Use |
---|---|---|
`--env TZ="Europe/Zurich"` | Sets the timezone in the container, which is important for the correct timestamping of logs. Replace `Europe/Zurich` with your own timezone from the list of available timezones. | Always |
`--env RSYNC_CRONTAB="crontab"` | Specifies that rsync is to be run as one or more cron jobs, and that the jobs are defined in the `crontab` file located in the bind-mounted `~/rsync` folder. The rsync arguments used in the crontab must refer to the data locations inside the container. | When using cron for regular rsync jobs |
`~/rsync` | Specifies the local folder `~/rsync` that is mounted to the container at `/rsync`. Change `~/rsync` if another location is chosen in Step 1. | When using cron or ssh |
`/path/to/source/data` | Specifies the source folder for sync, which is mounted to the container at `/data/src`. Change to the appropriate folder. Multiple folders can be mounted in this way. | If any source is local |
`/path/to/destination/data` | Specifies the destination folder for sync, which is mounted to the container at `/data/dst`. Change to the appropriate folder. Multiple folders can be mounted in this way. | If any destination is local |
`--env RSYNC_UID=$(id -u)` | Provides the UID of the user starting the container so that the files rsync copies are owned by that user. | If the rsync option for preserving ownership is not selected |
`--env RSYNC_GID=$(id -g)` | Provides the GID of the user starting the container so that the files rsync copies belong to that user's group. | If the rsync option for preserving ownership is not selected |
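
As a minimal sketch of the `crontab` file referenced by `RSYNC_CRONTAB`, the following writes a single job into the folder from Step 1. The schedule (daily at 02:00), the rsync options, and the log file name `rsync.log` are assumptions chosen for illustration, as is the standard five-field crontab format without a user column; only the container paths `/data/src`, `/data/dst`, and `/rsync` come from the setup above.

```shell
# Folder that is mounted into the container at /rsync (see Step 1).
RSYNC_DIR="${RSYNC_DIR:-$HOME/rsync}"
mkdir -p "$RSYNC_DIR/logs"

# Write a crontab with one job: run rsync every day at 02:00.
# All paths are written from the container's point of view.
cat > "$RSYNC_DIR/crontab" <<'EOF'
# min hour day month weekday command
0 2 * * * rsync -a --delete /data/src/ /data/dst/ >> /rsync/logs/rsync.log 2>&1
EOF

# Show the result for review.
cat "$RSYNC_DIR/crontab"
```

With this file in place, starting the container with `--env RSYNC_CRONTAB="crontab"` should pick up the job, and its output would accumulate in `~/rsync/logs/rsync.log` on the host.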
Remarks:
- rsync will not be run by default; you need to specify the rsync command with all its arguments in the crontab, or in a script called from the crontab
- any later change to the crontab file requires the service to be restarted, so consider defining the rsync job in a script that is called from the crontab
- when defining the rsync arguments, including source and destination, do that from the perspective of the container (`/data/src`, `/data/dst`)
- more volumes can be bind-mounted if needed
- the ssh client is included in the image in case your source or destination is a remote host
  - the files required by ssh (private key, known_hosts, ssh_config) need to be stored in a folder mounted to the container, for example in `~/rsync/ssh/`
  - you can define the ssh connection in an `ssh_config` file
  - the rsync option `-e "ssh -F /rsync/ssh/ssh_config"` instructs rsync to use ssh with that `ssh_config` for the remote sync
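
The remarks above on scripts and ssh can be combined into one sketch: an `ssh_config`, a script that runs rsync over ssh, and a crontab entry that only calls the script. The host alias `backup`, the host name `backup.example.com`, the user `backupuser`, the key file name `id_ed25519`, the remote path `/data/backup/`, and the script name `sync.sh` are all hypothetical; adjust them to your environment.

```shell
# Folder that is mounted into the container at /rsync.
RSYNC_DIR="${RSYNC_DIR:-$HOME/rsync}"
mkdir -p "$RSYNC_DIR/ssh" "$RSYNC_DIR/logs"

# ssh connection settings; file paths are container paths.
# The private key and known_hosts must be placed in the same folder.
cat > "$RSYNC_DIR/ssh/ssh_config" <<'EOF'
Host backup
    HostName backup.example.com
    User backupuser
    IdentityFile /rsync/ssh/id_ed25519
    UserKnownHostsFile /rsync/ssh/known_hosts
EOF

# Script holding the actual rsync job, so the job can be changed
# later without touching the crontab.
cat > "$RSYNC_DIR/sync.sh" <<'EOF'
#!/bin/sh
rsync -a -e "ssh -F /rsync/ssh/ssh_config" \
  /data/src/ backup:/data/backup/ >> /rsync/logs/rsync.log 2>&1
EOF
chmod +x "$RSYNC_DIR/sync.sh"

# Crontab entry that calls the script every day at 02:00.
echo '0 2 * * * /rsync/sync.sh' > "$RSYNC_DIR/crontab"
```

Because the crontab only points at `/rsync/sync.sh`, later edits to the rsync arguments in the script should not require changing the crontab itself.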