Analyze differences between two BTRFS snapshots (like GNU diff for directories).
It is a single Go script (package) of about 1000 lines of code (excluding blank lines and comments), plus a main script (binary) of 190 lines.
This is the output of btrfs-diff --help:
btrfs-diff-go - Analyse the differences between two related btrfs subvolumes.
USAGE
btrfs-diff-go [OPTIONS] PARENT CHILD
Analyse the difference between btrfs PARENT and CHILD.
btrfs-diff-go [OPTIONS] -f|--file STREAM
Analyse the differences from a STREAM file (output from 'btrfs send').
btrfs-diff-go [ -h | --help ]
Display help.
ARGUMENTS
PARENT
A btrfs subvolume that is the parent of the CHILD one.
CHILD
A btrfs subvolume that is the child of the PARENT one.
OPTIONS
-h | --help
Display help.
-i | --info
Be verbose.
-d | --debug
Be more verbose.
-f | --file STREAM
Use a STREAM file to get the btrfs operations.
This stream file must have been generated by the command
'btrfs send' (with or without the option --no-data).
-t[changed] | --with-times[=changed]
By default time modifications are ignored. With that option
they will be taken into account. They are labelled as 'times'
but if you also specify '=changed' they will be labelled
'changed'.
-p[changed] | --with-perms[=changed]
By default permission modifications are ignored. With that option
they will be taken into account. They are labelled as 'perms'
but if you also specify '=changed' they will be labelled
'changed'.
-o[changed] | --with-own[=changed]
By default ownership modifications are ignored. With that option
they will be taken into account. They are labelled as 'own'
but if you also specify '=changed' they will be labelled
'changed'.
-a[changed] | --with-attr[=changed]
By default attribute modifications are ignored. With that option
they will be taken into account. They are labelled as 'attr'
but if you also specify '=changed' they will be labelled
'changed'.
EXAMPLES
Get the differences between two snapshots.
$ btrfs-diff-go /backup/btrfs-sp/rootfs/2020-12-25_22h00m00.shutdown.safe \
/backup/btrfs-sp/rootfs/2019-12-25_21h00m00.shutdown.safe
AUTHORS
Originally written by: David Buckley
Extended, fixed, and maintained by: Michael Bideau
REPORTING BUGS
Report bugs to: <https://github.com/mbideau/btrfs-diff-go/issues>
COPYRIGHT
Copyright © 2020-2021 Michael Bideau.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Info: original license chosen by David Buckley was MIT, but it allows sublicensing, so I
chose to sublicense it to GPLv3+ to ensure code sharing
SEE ALSO
Home page: <https://github.com/mbideau/btrfs-diff-go>
First, install the required dependencies (example for Debian / Ubuntu)
~> sudo apt install golang libbtrfs-dev
Use the convenient go install:
~> go install github.com/mbideau/btrfs-diff-go
That will create a binary named btrfs-diff-go in $GOPATH/bin.
Clone the repository, run the build then install it
~> git clone -q https://github.com/mbideau/btrfs-diff-go.git
~> cd btrfs-diff-go
~> go build -v
~> go install
And rename it to btrfs-diff, if you don't care about the implementation language.
~> [ "$GOPATH" != '' ] || GOPATH="$HOME/go"
~> sudo cp $GOPATH/bin/btrfs-diff-go /usr/local/bin/btrfs-diff
~> sudo chmod +x /usr/local/bin/btrfs-diff
The great advantage of having a COW filesystem with snapshotting like BTRFS is that producing the differences between two snapshots is almost instantaneous.
For example, you can get the differences between snap1 and snap2 with the following command:
~> sudo btrfs send --quiet --no-data -p snap1 snap2 | LC_ALL=C btrfs receive --quiet --dump > /tmp/btrfs.dump
Note that this dump is not really human readable. Moreover, it contains operations, not differences, so it is not exactly what we are looking for. For example, it might contain information about transient objects, and multiple lines of unintuitive operations needed to reproduce a file state.
I wanted a differences file format like the one you get when doing diff -rq or git status --short, in short: a human friendly one.
I looked at the prior art (see below), but nothing was satisfying enough, so I rolled my own diff utility (which produces the stream with btrfs send and then parses it).
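As a rough illustration of the "produce the stream" half, the sketch below drives the same two commands as the shell pipeline above from Go. It is only a sketch under assumptions: it spawns the btrfs CLI through os/exec rather than doing what btrfs-diff-go does internally, and the names dumpCommands and runDump are hypothetical, not part of this project.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
)

// dumpCommands (hypothetical helper) builds, without starting them, the two
// processes of the pipeline: btrfs send ... | LC_ALL=C btrfs receive --dump
func dumpCommands(parent, child string) (send, receive *exec.Cmd) {
	send = exec.Command("btrfs", "send", "--quiet", "--no-data", "-p", parent, child)
	receive = exec.Command("btrfs", "receive", "--quiet", "--dump")
	receive.Env = append(os.Environ(), "LC_ALL=C")
	return send, receive
}

// runDump (hypothetical helper) connects both processes with an OS pipe and
// writes the textual dump to out. It usually needs root privileges.
func runDump(parent, child string, out io.Writer) error {
	send, receive := dumpCommands(parent, child)
	pr, pw, err := os.Pipe()
	if err != nil {
		return err
	}
	send.Stdout, send.Stderr = pw, os.Stderr
	receive.Stdin, receive.Stdout, receive.Stderr = pr, out, os.Stderr

	if err := send.Start(); err != nil {
		pw.Close()
		pr.Close()
		return err
	}
	recvErr := receive.Start()
	// The children hold their own copies of the pipe ends; closing ours lets
	// btrfs receive see end-of-file once btrfs send exits.
	pw.Close()
	pr.Close()
	sendErr := send.Wait()
	if recvErr != nil {
		return recvErr
	}
	if err := receive.Wait(); err != nil {
		return err
	}
	return sendErr
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: dump PARENT CHILD")
		return
	}
	if err := runDump(os.Args[1], os.Args[2], os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "dump failed:", err)
		os.Exit(1)
	}
}
```

Run as root with two related snapshots as arguments, this should print the same kind of operation list as the shell command, which still has to be turned into differences.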
At the time of writing (i.e.: Dec. 2020), I have found 2 projects matching btrfs diff on GitHub and 0 on GitLab.
- btrfs-send-go [Go]
  The one that this project has extended, fixed and improved.
  The original author's version is raw and has minor bugs, but it does the job.
  It is also not translatable (as-is).
- btrfs-snapshots-diff [Python 2]
  It has a lot of issues (with links, but not only), and it is written in Python 2, which is now deprecated.
  No go.
- btrfs-snapshots-diff [Python 3]
  A fork of the previous one, with a lot of issues fixed and in Python 3.
  Because it is written in Python, running it in the initramfs (which I want to do) would mean including the Python binary and the required dependencies. Too much for what I want.
  Maybe I could compile it with Cython, but I am not (yet) comfortable with that.
There is also the snapper utility that compares BTRFS snapshots, but it does so by mounting both snapshots and doing a "standard" diff on them (if my understanding is correct).
Finally, I have found a lot of small Python scripts doing a BTRFS diff, but they were using a hacky way to do it (based on the find-new method), without being able to catch deletions.
They were better than nothing prior to btrfs send and btrfs receive, but they have been obsolete since. Hence, I skipped all those.
So, I almost found what I wanted after patching and fixing btrfs-send-go, but I was not confident enough to trust it, and it still lacked the translation layer.
This is why I first decided to roll my own script, in POSIX shell, that has all the features I was looking for. See btrfs-diff-sh.
I was happy with it, but it was a little too slow, so I decided to go back to the Go version (pun intended), fix its bugs and improve its user experience.
Here it is. Way faster than the shell version. Not measured yet (it's a feeling).
Cool features implemented:
- can produce the raw diff from two snapshots or parse a raw stream file
- produces an output close to that of the diff -rq utility and git status --short
- fast
It does the job, but has some limits.
It was not tested on huge dumps, so it might not perform well or might reveal major bugs.
Due to the BTRFS implementation, some files appear as changed when they are not (according to the diff utility). I have absolutely no idea why BTRFS is acting like this… If someone can help me figure this out, I'll be glad.
If you have any question or want to share a case that is not covered, I will be glad to answer and to accept changes through pull requests.
Make your changes, then, in the source directory, run:
~> go build -v
The program follows this process:
- produce (with the btrfs send syscall) or get a BTRFS stream file (from a CLI argument)
- parse this file in binary mode
- extract commands and their parameters (should match the lines of btrfs receive --dump)
- those commands are mapped to operations (e.g.: command 'delete' => operation 'delete')
- commands are associated with paths, mostly only one, and two for a rename operation
- for each command's path, we re-create the file tree with an object called 'node'
- we maintain two trees: one for new files, one for old files
- new files can have an original one, and old files can have a new version
- after having processed all the commands, we flatten and analyze the trees
- for each old file we produce the resulting change, then the same for each new file
To make sure that the program works in your environment, or that you have not broken anything while developing, run the tests with the following command:
~> sh test.sh
It is very raw, but it already tests a lot of cases.
In order of priority:
- create a screencast to show off the program
- make the program translatable
- create a GitHub action to automatically insert the help into the README from the execution of the command
- create a GitHub action to generate a table of links to the sections at the top of the README
- create an alpha release once the program has received enough testing (and possibly runs in real life conditions)
Originally written by: David Buckley
Extended, fixed, and maintained by: Michael Bideau
With a lot of thanks to :
- Mek101: very good crash-tester 😉
Copyright © 2020-2021 Michael Bideau [France]
All the btrfs-diff-go source codes (every file but README.md, CODE_OF_CONDUCT.md and LICENSE) are licensed under the GPLv3+ license.
btrfs-diff-go is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
btrfs-diff-go is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with btrfs-diff-go. If not, see https://www.gnu.org/licenses/.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
Copyright © 2020-2021 Michael Bideau, France
This document is licensed under a
Creative Commons Attribution 4.0 International License.
Michael Bideau, France
I started with formiko, then used vim with linters to help catching mistakes and badly written sentences: