
objectlock

Apply object locks to all objects in a B2 bucket, and refresh their retention time.

This lets you use the ransomware-protection features of B2 Object Lock even without support from the backup tool, so you can use rclone, kopia or other tools that do not support it yet and still enjoy the protection.

It was inspired by this feature request on kopia: kopia/kopia#1067

first version

This initial implementation uses the B2 Command Line Tool and requires you to hardcode all the parameters.

It should be possible to do pretty much the same thing with AWS S3 using its CLI.
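
For example, here is the same idea sketched with boto3 (the S3 SDK) rather than the CLI; the bucket, key, version id and lock duration are placeholders, not values taken from this project:

    import datetime
    import boto3

    s3 = boto3.client("s3")  # assumes AWS credentials are already configured

    # Extend the lock on one object version; GOVERNANCE matches --lockmode governance.
    s3.put_object_retention(
        Bucket="<bucket>",
        Key="<path>/some-file",
        VersionId="<versionId>",
        Retention={
            "Mode": "GOVERNANCE",
            "RetainUntilDate": datetime.datetime.now(datetime.timezone.utc)
            + datetime.timedelta(days=30),
        },
    )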

Usage

  • install the B2 Command Line Tool
  • create an AppKey with the writeFileRetentions and readFileRetentions permissions - the AppKeys created from the WebUI do not have those:
    b2 create-key --bucket <bucket_name> <key_name> deleteFiles,listAllBucketNames,listBuckets,listFiles,readBucketEncryption,readBucketReplications,readBuckets,readFiles,shareFiles,writeBucketEncryption,writeBucketReplications,writeFiles,writeFileRetentions,readFileRetentions
  • authenticate using b2 authorize-account --profile <profilename> <keyID>
  • run ./b2_objectlock_all.py --profile <profilename> <bucket> <path> to check
  • run with --update to apply the corrections
  • adjust the B2 bucket
    • set Lifecycle Settings to "Keep prior versions for this number of days: N" with N = lockdays + 1 (see the sketch after this list)
    • For testing, I think it's better not to set a default object lock mode and retention. For production, it may be appropriate to set a default retention longer than the interval between runs of b2_objectlock_all.
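
The lifecycle rule can also be set programmatically. A minimal sketch using b2sdk; the bucket name and key are placeholders, and I assume a b2sdk version with file lock support:

    import b2sdk.v2 as b2

    LOCK_DAYS = 30  # assumption: the value you pass as --lockdays

    api = b2.B2Api(b2.InMemoryAccountInfo())
    api.authorize_account("production", "<keyID>", "<applicationKey>")
    bucket = api.get_bucket_by_name("<bucket_name>")

    # "Keep prior versions for this number of days: N" with N = lockdays + 1
    bucket.update(lifecycle_rules=[{
        "fileNamePrefix": "",
        "daysFromHidingToDeleting": LOCK_DAYS + 1,
        "daysFromUploadingToHiding": None,
    }])
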
host:objectlock$ ./b2_objectlock_all.py --help
usage: b2_objectlock_all.py [-h] --profile PROFILE [--fileagemax FILEAGEMAX] [--lockmode {governance}] [--lockdays LOCKDAYS] [--update] bucket path

check or update B2 Object Locks

positional arguments:
  bucket
  path

options:
  -h, --help            show this help message and exit
  --profile PROFILE     B2 profile
  --fileagemax FILEAGEMAX
                        max age of ls cache file in seconds, default 600
  --lockmode {governance}
                        Lock mode
  --lockdays LOCKDAYS   Lock for days
  --update              Update lock mode & days
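
For orientation, the core of what --update does is roughly the following. This is an illustrative b2sdk sketch, not the script itself; bucket, path and key are placeholders, and I assume a b2sdk version with file lock support:

    import time
    import b2sdk.v2 as b2

    LOCK_DAYS = 30  # assumption: the --lockdays value

    api = b2.B2Api(b2.InMemoryAccountInfo())
    # The key needs writeFileRetentions and readFileRetentions, as noted above.
    api.authorize_account("production", "<keyID>", "<applicationKey>")
    bucket = api.get_bucket_by_name("<bucket>")

    retain_until_ms = int((time.time() + LOCK_DAYS * 86400) * 1000)
    retention = b2.FileRetentionSetting(b2.RetentionMode.GOVERNANCE, retain_until_ms)

    # Walk all versions under <path> and push the retention date forward.
    for file_version, _ in bucket.ls("<path>", latest_only=False, recursive=True):
        if file_version.action != "upload":
            continue  # "hide" markers carry no retention
        api.update_file_retention(file_version.id_, file_version.file_name, retention)
        print("locked", file_version.file_name, "until", retain_until_ms)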

known bugs

The ls cache file uses only the bucket name, not the path. If you change the path, the script will still process the last cached file list. Force a refresh by deleting the cache files *.ls.json.
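
Until that is fixed, clearing the cache from Python is trivial; this assumes the *.ls.json files live in the current directory:

    import pathlib

    # Assumption: the *.ls.json cache files are in the current directory.
    for cache in pathlib.Path(".").glob("*.ls.json"):
        cache.unlink()
        print("removed", cache)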

usage with rclone

Currently I'm testing this code with a small backup made with rclone.

Make sure your rclone remote config has hard_delete = false (use rclone config show). That makes rclone "hide" objects on deletion and leaves the actual deletion to the bucket's Lifecycle Settings.

With hard_delete = true, rclone will try to really delete the file and fail, and b2_objectlock_all will keep refreshing its retention.
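
A quick way to check the setting from Python; this sketch assumes the default rclone config location on Linux and a remote named b2remote:

    import configparser
    import pathlib

    # Default rclone config location on Linux; adjust for your system.
    conf = pathlib.Path.home() / ".config" / "rclone" / "rclone.conf"

    cfg = configparser.ConfigParser()
    cfg.read(conf)

    remote = "b2remote"  # assumption: the name of your B2 remote
    value = cfg.get(remote, "hard_delete", fallback="<unset>")
    print(f"{remote}: hard_delete = {value}")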

usage with kopia

I have not yet started testing with Kopia. It may need to exclude some files, such as logs.

Testing

Testing B2, step 1

In the first round of tests, I'm not trying to bypass governance mode. I expect this will match what I would see if I were using compliance mode already.

delete files from rclone sync

rclone sync will legitimately delete files that are no longer in the source. This needs hard_delete = false to work with Object Lock, which means rclone has to leave the actual deletion to the B2 bucket Lifecycle Settings.
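
For reference, a "soft delete" in B2 is just a hide marker. With b2sdk it would look like this (a sketch; names and key are placeholders):

    import b2sdk.v2 as b2

    api = b2.B2Api(b2.InMemoryAccountInfo())
    api.authorize_account("production", "<keyID>", "<applicationKey>")
    bucket = api.get_bucket_by_name("<bucket>")

    # The equivalent of an rclone delete with hard_delete = false: the data
    # stays as a prior version until Lifecycle Settings remove it.
    marker = bucket.hide_file("<path>/deleted-file")
    print(marker.action)  # -> "hide"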

delete file manually via WebUI

The "hidden" file version left by rclone deletion does not get a lock set by b2_objectlock_all. This means that the "hidden" version can be deleted, and the non-hidden version will reappear, its lock may get refreshed too. Anyway rclone sync should soft-delete it again at the next run.

At worst, this will cause some extra files to reappear in a restore, if the restore happens before a new sync; in general it should not cause data loss.

Files and versions that are locked are skipped by the B2 WebUI: nothing is deleted and no data is lost.

delete files with rclone purge

tbd

It may not lose data, but it could duplicate each object again if a sync is done without first resurrecting the old versions; that could result in a pretty high storage cost for the duration of the retention.

delete files using full privilege

E.g. with the Master Application Key.