Scheduled backup of Google Drive to AWS S3 bucket using rclone sync

Primary language: TypeScript. License: MIT.

Google Drive Backup

Requirements

  • node.js & npm
  • Docker CLI for building & pushing images
  • AWS CLI for initial setup (optional)

Building Docker images

If the Docker image in ECR is out of date, the cdk deploy command builds a new image using the Docker CLI and pushes it to ECR for use in the ECS Task. This image must be built for the amd64 architecture. On an Apple Silicon Mac (e.g. an M1), the Docker CLI builds arm64 images by default. Because the Docker CLI commands are generated by the AWS CDK code, there is no easy way to pass the --platform option to change the image architecture. Instead, set the DOCKER_DEFAULT_PLATFORM environment variable to linux/amd64 in the shell from which you run cdk deploy.
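If you launch cdk deploy from a Node.js script rather than directly from a shell, the same effect can be had by overriding the variable in the child process environment. The following is a minimal sketch, not part of this repository; the helper name withAmd64Platform is hypothetical, and the commented usage assumes the cdk CLI is on your PATH.

```typescript
// Hypothetical helper (not part of this repo): returns a copy of an
// environment map with DOCKER_DEFAULT_PLATFORM forced to linux/amd64,
// leaving every other variable untouched.
type Env = Record<string, string | undefined>;

function withAmd64Platform(env: Env): Env {
  return { ...env, DOCKER_DEFAULT_PLATFORM: "linux/amd64" };
}

// Usage sketch (assumes Node.js and the cdk CLI on PATH):
//   import { spawnSync } from "node:child_process";
//   spawnSync("cdk", ["deploy"], {
//     env: withAmd64Platform(process.env),
//     stdio: "inherit",
//   });
```

Copying the environment instead of mutating it keeps the override scoped to the deploy process.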

Significant files

The stack schedules an ECS Fargate Task that executes a backup script within a Docker container. The backup script runs the rclone sync command with a Google Drive directory as the source and an AWS S3 bucket as the target.

├── lib
│   └── google-drive-backup-stack.ts      # contains ECS Task definition
└── local-image
    ├── Dockerfile                        # defines docker image for ECS Task
    └── home
        ├── backup.sh                     # script executed by ECS Task
        └── rclone.conf                   # includes config for AWS S3
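To make the shape of the rclone sync invocation concrete, here is a rough sketch of how its argument list could be assembled. It does not reproduce backup.sh: the remote names "drive" and "s3", the BackupOptions type, and the bucketName field are assumptions for illustration. The --drive-impersonate flag is the rclone option that GOOGLE_DRIVE_IMPERSONATION_EMAIL is documented to feed.

```typescript
// Illustrative only: the real backup.sh is a shell script and may differ.
interface BackupOptions {
  driveFolder: string;        // e.g. the value of GOOGLE_DRIVE_FOLDER
  impersonationEmail: string; // e.g. the value of GOOGLE_DRIVE_IMPERSONATION_EMAIL
  bucketName: string;         // hypothetical: name of the S3 bucket created by the stack
}

function rcloneSyncArgs(opts: BackupOptions): string[] {
  return [
    "sync",
    `drive:${opts.driveFolder}`, // source; remote name "drive" is an assumption
    `s3:${opts.bucketName}`,     // target; remote name "s3" is an assumption
    "--drive-impersonate",
    opts.impersonationEmail,
  ];
}
```

Because rclone sync makes the target match the source, files deleted from the Drive folder are also deleted from the bucket on the next run.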

Configuration

Specify values for the following environment variables in the .env file:

  • HEALTHCHECKS_URL - ping URL for healthchecks.io check
  • GOOGLE_DRIVE_IMPERSONATION_EMAIL - email address to use with rclone --drive-impersonate option
  • GOOGLE_DRIVE_FOLDER - source folder path
  • RCLONE_S3_REGION - AWS region in which cdk deploy was run, and therefore the region in which the S3 bucket was created
  • CRON_SCHEDULE - JSON representation of JavaScript object conforming to CronOptions interface, e.g. {"weekDay":"mon","hour":"03","minute":"15"}
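As a sketch of how these values might be checked before deploying, the snippet below reports missing variables and parses CRON_SCHEDULE. It is not taken from the stack code: CronFields is a local, dependency-free mirror of the field names in the AWS CDK CronOptions interface, and the function names are hypothetical.

```typescript
// Local mirror of the CronOptions field names, so the sketch has no dependencies.
interface CronFields {
  minute?: string;
  hour?: string;
  day?: string;
  month?: string;
  year?: string;
  weekDay?: string;
}

const CRON_KEYS: (keyof CronFields)[] = ["minute", "hour", "day", "month", "year", "weekDay"];

const REQUIRED_VARS = [
  "HEALTHCHECKS_URL",
  "GOOGLE_DRIVE_IMPERSONATION_EMAIL",
  "GOOGLE_DRIVE_FOLDER",
  "RCLONE_S3_REGION",
  "CRON_SCHEDULE",
];

// Returns the names of required variables that are unset or empty.
function missingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Parses CRON_SCHEDULE, rejecting non-objects, unknown fields and non-string values.
function parseCronSchedule(json: string): CronFields {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("CRON_SCHEDULE must be a JSON object");
  }
  const obj = parsed as Record<string, unknown>;
  for (const key of Object.keys(obj)) {
    if ((CRON_KEYS as string[]).indexOf(key) === -1) {
      throw new Error(`Unknown CRON_SCHEDULE field: ${key}`);
    }
  }
  const result: CronFields = {};
  for (const key of CRON_KEYS) {
    const value = obj[key];
    if (value === undefined) continue;
    if (typeof value !== "string") {
      throw new Error(`CRON_SCHEDULE field ${key} must be a string`);
    }
    result[key] = value;
  }
  return result;
}
```

Rejecting unknown field names catches typos such as "weekday" early, rather than leaving them to surface when the schedule is created.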

Healthchecks

  • Create a healthchecks.io account, create a project, and add a check with a suitable period and grace time to ensure the task completes successfully according to the schedule defined in CRON_SCHEDULE.

  • Add suitable integrations to the check to provide relevant notifications, e.g. email, Slack, etc.

  • Set HEALTHCHECKS_URL to the "ping URL" for the check, which has the form https://hc-ping.com/${uuid}.
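healthchecks.io also accepts /start and /fail suffixes on the ping URL to signal that a run has begun or has failed. Whether backup.sh uses these is not stated here, so the following is only a sketch of how the URLs are derived from HEALTHCHECKS_URL; the pingUrl helper and PingSignal type are hypothetical names.

```typescript
// Hypothetical helper: derives the healthchecks.io URL for a given signal
// from the base ping URL stored in HEALTHCHECKS_URL.
type PingSignal = "success" | "start" | "fail";

function pingUrl(baseUrl: string, signal: PingSignal = "success"): string {
  // Drop any trailing slashes so a suffix can be appended cleanly.
  const trimmed = baseUrl.replace(/\/+$/, "");
  return signal === "success" ? trimmed : `${trimmed}/${signal}`;
}
```

Pinging the /fail URL when the sync exits with an error triggers notifications immediately, instead of only after the grace time has elapsed.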

Google Drive

  • Store the service account credentials from google-drive-credentials.json in AWS Secrets Manager:

$ aws secretsmanager put-secret-value \
  --secret-id /google-drive-backup/RCLONE_DRIVE_SERVICE_ACCOUNT_CREDENTIALS \
  --secret-string file://google-drive-credentials.json

  • You can delete the temporary file, google-drive-credentials.json, once the secret is stored, but consider keeping a record of the credentials somewhere secure.

  • The RCLONE_DRIVE_SERVICE_ACCOUNT_CREDENTIALS environment variable specifies the credentials for rclone to use for Google Drive (see the rclone Google Drive documentation on service account credentials for details).

AWS S3

  • The S3 bucket is set up automatically when cdk deploy creates the stack. The rclone env_auth config setting is set to true so that rclone uses the IAM role assigned to the ECS Task (see the env_auth option in the rclone S3 documentation).

Useful commands

Note that cdk commands should be run with credentials for an AWS IAM user that has wide-ranging permissions to use CloudFormation to create, update, and destroy AWS resources.

  • npm run build - compile TypeScript to JavaScript
  • npm run watch - watch for changes and compile
  • npm run test - run the Jest unit tests
  • cdk deploy - deploy this stack to your default AWS account/region
  • cdk diff - compare the deployed stack with the current state
  • cdk synth - emit the synthesized CloudFormation template