/gosure

File integrity implemented in Go

Primary LanguageGoBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

Gosure file integrity

It has been said that backups aren't useful unless you've tested them. But, how does one know that a test restore actually worked? Gosure is designed to help with this.

History

The md5sum program captures the MD5 hash of a set of files. It can also read this output and compare the hashes against the files. By capturing the hashes before the backup, and comparing them after a test restore, you can gain a bit of confidence that the contents of files is at least correct.

However, this doesn't capture the permissions and other attributes of the files. Sometimes a restore can fail for this kind of reason.

Intrusion detection

There have been several similar solutions focused on intrusion detection. Tripwire and FreeVeracity (or Veracity) come to mind. The idea is that the files are compared in place to verify that nobody has modified them.

Unfortunately, at least tripwire seems to focus so heavily on this intrusion detection problem, that the tool doesn't work very well for verifying backups. It really wants a central database, and to use files by absolute pathname. FreeVeracity was quite useful for verifying backups, however, it appears to have vanished entirely (it was under an unusual license).

Incremental updates

One thing that none of these solutions addressed was that of incremental updates, probably because of the focus on intrusion detection. In a normal running system, the POSIX ctime field can be reliably used to determine if a file has been modified. By making use of this, the integrity program can avoid recomputing hashes of files that haven't changed. This strategy is similar to what most backup software does as well. This is important, because taking the time to hash every file can make the integrity update take so long that people avoid running it. Full hashing is impractical for the same reasons that regular full backups are usually impractical.

Using gosure

Getting it

Gosure is written in Go.

There are two ways to build gosure. You can just build it, standalone:

$ git clone https://github.com/d3zd3z/gosure
$ cd gosure
$ go run build.go
$ cp gosure ~/bin

This will build a version with the release tag information embedded in the executable.

If you want to do any work on the code, it is generally best to work with Go using its idea of a workspace. You should create a directory somewhere for go work, and set the environment variable GOPATH to point to this. Once this is done, use the go tools to fetch this project:

$ go get davidb.org/x/gosure/cmd/gosure

Note

Although this project is hosted at github.com (currently), the go tool should complain if you try to fetch using that path. This is because the package needs to be able to reference sub-packages by full name, and these will only work if the package is fetched via its canonical name.

Once the tree is present:

$ go install davidb.org/x/gosure/cmd/gosure

should install the gosure program itself in $GOPATH/bin. Add this to the path to make things more convenient. The execuable is standalone, and has no dependencies on the source tree.

Basic usage

Change to a directory you wish to keep integrity for, for example, my home directory:

$ cd
$ gosure scan

This will scan the filesystem (possibly showing progress), and leave a 2sure.dat.gz (the 2sure is historical, FreeVeracity used a name starting with a 0, and having the digit makes it near the beginning of a directory listing). You can view this file if you'd like. Aside from being compressed, the format is plain ASCII (even if your filenames are not).

Then you can do:

$ gosure check

to verify the directory. This will show any differences. If you back up this file with your data, you can run gosure after a restore to check if the backup is correct.

Later, you can run

$ gosure update

which will update the sure data, adding another weave delta to the 2sure.dat.gz file. The old file will be moved to 2sure.back.gz for safety (it is not normally needed as each file will have the whole history). You can then compare the two most recent versions with:

$ gosure signoff

will compare the old scan with the current, and report on what has changed between them.

Weave Deltas

Gosure uses the weave delta format[#] to store multiple versions in a single file.

[1]The weave format was developed by Marc Rochkind as part of the SCCS revision control system. Although much of SCCS is dated, the particular way it stores all file revisions in a single “weave” file is particularly useful to the types of changes that happen to surefiles. Gosure uses the weave data format exactly as SCCS does, but uses its own header. The headers on SCCS have numerous limitations that would render it less useful, such as file sizes limited to 100,000 lines, and 2 year dates.

Each delta can have arbitrary metadata associated with it. These values can be added with --tag key=value. The name key will override the default timestamp name. It may be useful to indicate other information about when the scan was taken. The tags are arbitrary key/value pairs, although both should be restricted to printable characters.