# BetterSign
## Todo
- Initialize the repo and set up initial structure.
- Add the LICENSE file.
- Rough draft of the README.md.
- Create the initial BetterSign lib and executable driver.
- Add error_chain crate to simplify the custom error types.
- Switch to using background-jobs crate to handle the job queueing.
- Add support for r/w of Linked Data Signature format signatures.
- Add support for r/w of JWT/JWS format signatures.
- Define the status keywords and parameters for Git interface.
## Introduction
BetterSign (`bs`) is a new signing tool designed to streamline the generation
and verification of signed manifests for files and data. The goal is to create
a better code and release signing tool that integrates seamlessly with Git and
adds real value to digitally signed commits through a new strategy for key
management.
Git's reliance on GPG as its sole signing tool causes several problems. Most people who clone a repo do not have all of the public keys of the commit signers, nor do they want to spend the time it takes to manually download those keys from key servers. Even if they work through the difficult task of generating a list of key IDs from the repo and download the public keys, they cannot necessarily trust that the keys are the real keys used by the commit signers.
The solution to this problem is to store the "keyring" of public keys in the Git repo itself and to track the keys as you would any source code file in the repo. This gives a repo the ability to self-validate the signatures in the repo log. It also allows for the revocation of keys through simply deleting the key from the repo. With simple Git hooks, rules such as "all commits must be signed by a key in the repo" are trivial to implement and help enforce the provenance regime.
So why a new signing tool? Why not just use GPG? We could store GPG keyrings in the repo, but GPG keyrings are binary blobs: every key management change in a tracked repo (e.g. adding a new key) would produce a new copy of the keyring, and the result would be difficult to inspect. The BetterSign tool is designed to understand decentralized identifier documents (DID docs) that contain the key material and any other information associated with a contributor identity. Typically, DID docs are formatted in JSON and are therefore both human and machine readable and easy to manage as tracked files in a repo.
BetterSign uses a simple organization of DID documents in a Git repo that is inspired by the Maildir standard. The unimaginative name is the DIDdir format and it is specified in the W3C DID Git Method specification. The DID Git Method specification also defines the standard way to reference a given Git repo and a specific identity stored within it, called a decentralized identifier (DID). In the case of identities stored in a Git repo, the DID begins with "did:git". You pass DIDs to BetterSign as the arguments specifying the identities for the different operations (e.g. "sign", "verify").
BetterSign is designed for managing identities and key material for contributors to open source projects, but it is also useful as a generic signing tool. By default, BetterSign creates a DIDdir Git repo in your home folder, similar to the way GPG does. That is your personal keyring, and the key material in it may contain secrets such as your private keys, just as a GPG keyring does.
## User Interface
BetterSign is implemented as a command line tool called `bs`. The interface is
rather simple and follows the pattern `bs <subcommand> [options]`. In all
cases, BetterSign uses the DIDdir in the user's home folder unless the
`--keyring` command line option is used. It also uses the identity in the DID
document with the "default" alias unless the `--did` option is used with a
valid DID. See the DID Git Method specification mentioned above for details on
identity aliases.
BetterSign also supports outputting machine parseable status on a given file
descriptor specified by the `--status-fd` command line option. The format of
the output is similar to what GPG outputs. Each line begins with "[BS:] "
followed by a status keyword (e.g. DID_CONSIDERED) followed by the
parameters for the keyword, if any.
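As a minimal sketch of how a caller might consume these status lines, the Python snippet below splits a line into its keyword and parameters. The `parse_status_line` helper is hypothetical, and it assumes the parameters are separated by whitespace; only the "[BS:] " prefix and the `DID_CONSIDERED` keyword come from the description above.

```python
from typing import List, Optional, Tuple

STATUS_PREFIX = "[BS:] "  # line prefix described above


def parse_status_line(line: str) -> Optional[Tuple[str, List[str]]]:
    """Return (keyword, params) for a status line, or None for other output.

    Assumption: parameters are whitespace-separated.
    """
    line = line.rstrip("\r\n")
    if not line.startswith(STATUS_PREFIX):
        return None
    fields = line[len(STATUS_PREFIX):].split()
    if not fields:
        return None
    return fields[0], fields[1:]
```

A caller reading the status file descriptor line by line could feed each line to this function and ignore anything that returns `None`.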
## Sign
The `sign` subcommand generates a detached digital signature over the given
file(s) or data piped over stdin. To generate a signature over one or more
files, execute `bs` like so:

    $ bs sign [options] [<file> ...]

To generate a signature for data piped to BetterSign over stdin, use `-`
instead of the file name(s). In addition to the `--keyring` and `--did`
options, the `sign` subcommand also supports a `--format` option for
specifying the format of the resulting signature. The supported values are
`lds` for the Linked Data Signature (LDS) format and `jwt` for the
JSON Web Token (JWT) format.
BetterSign uses the EdDSA algorithm when signing data, specifically Ed25519: the combination of the SHA-512 digest algorithm and an Edwards curve signature scheme. When the output is in LDS format, the signature type value is "Ed25519Signature2018". When the output is in JWT format, BetterSign uses the non-standard "alg" param value of "ED512" meaning EdDSA. The current standard set of "alg" values is defined in the JSON Web Algorithms RFC 7518 §3.1 and does not contain any Ed25519 based signature schemes so BetterSign is doing...uh...better.
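For illustration, the following Python sketch shows what the protected header of such a JWT could look like once encoded. Only the "alg" value of "ED512" comes from the text above; the "typ" field and the `encode_jws_header` helper are assumptions, and the base64url-without-padding encoding follows the general JWS convention rather than anything confirmed about BetterSign's output.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as JWS segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Reverse b64url(), restoring the stripped padding first."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def encode_jws_header(alg: str = "ED512") -> str:
    # "typ" is an assumed field; "alg" is the non-standard value described above.
    header = {"alg": alg, "typ": "JWT"}
    return b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
```

Decoding the segment with `b64url_decode` and `json.loads` gives back the header dictionary, which is a convenient way to inspect the "alg" value of a signature file.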
When used with Git to sign commits, Git pipes the data to be signed to BetterSign over stdin. The resulting signature is in LDS format by default. When signing files with LDS output, BetterSign writes a linked data signature file with a non-standard "files" attribute inside the proof; the attribute is part of the data that is signed. The "files" attribute is an array of JSON objects, each with a single key consisting of the file name and a value that is the digest of the file, like so:

    {
      "@context": "https://w3id.org/identity/v1",
      "proof": {
        "type": "Ed25519Signature2018",
        "creator": "did:git:...",
        "created": "1970-01-01T00:00:00Z",
        "nonce": "...",
        "proofValue": "...",
        "files": [
          { "foo.txt": "..." },
          { "bar.txt": "..." },
          { "baz.txt": "..." }
        ]
      }
    }

This serves as a manifest file for the authentication of the files included in
the signature.
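A rough Python sketch of how the "files" array could be assembled is shown below. The text above does not say which digest algorithm or encoding is used for the per-file values, so SHA-512 with hex encoding is an assumption here, and the `file_digest` and `build_files_attribute` helpers are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def file_digest(path: Path) -> str:
    """Hex digest of a file's contents (SHA-512 and hex are assumptions)."""
    h = hashlib.sha512()
    with path.open("rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_files_attribute(paths: Iterable[str]) -> List[Dict[str, str]]:
    """Build one single-key object per file: {file name: digest}."""
    return [{p: file_digest(Path(p))} for p in paths]
```

The returned list has the same shape as the "files" array in the example signature, with one object per file in the order the files were given.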
## Verify
The `verify` subcommand takes a signature file in either LDS or JWT format and
attempts to verify the data signed. If the signature file does not contain a
"files" attribute in the "proof", it is assumed that the data that was signed
will be piped over stdin.

If the signature file does contain a "files" attribute in the "proof", then
the listed files are located, run through the digest algorithm, and checked
for modification before the signature itself is verified.
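The file check that precedes signature verification might look like the Python sketch below. It is only an illustration: the `modified_files` helper is hypothetical, the digest is assumed to be hex-encoded SHA-512, and file names are assumed to be resolved relative to a `root` directory chosen by the caller.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def modified_files(files_attr: List[Dict[str, str]], root: str = ".") -> List[str]:
    """Return the names of listed files that are missing or whose digest differs."""
    bad = []
    for entry in files_attr:
        for name, expected in entry.items():
            path = Path(root) / name
            if not path.is_file():
                bad.append(name)  # a missing file counts as a failure
                continue
            actual = hashlib.sha512(path.read_bytes()).hexdigest()
            if actual != expected:
                bad.append(name)
    return bad
```

Under these assumptions, the signature itself would only be checked once this function returns an empty list.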
An important detail is that the identities in the DIDdir change over time, so
the historical context is necessary to find the correct DID document and key
material needed to verify a signature. For that, BetterSign relies on the
position in the repo history to derive the correct state of the DIDdir before
resolving the "creator" DID into the correct DID document and extracting the
key material for the signature verification. BetterSign itself does not need
to worry about this detail because it is handled by the DIDdir library, but it
is something to be aware of to get your mental model correct.
## Notes on Git
The current Git commit signing system is hard coded to use GPG/GPGSM and
supports both PGP identities with GPG and X.509 identities with GPGSM. Git
relies entirely on external tools to sign and verify data. Git only handles
storing the signatures in, and extracting them from, the commit metadata and
passing them to GPG.
BetterSign has an interface that is somewhat similar to GPG to make the
modifications to Git simpler. BetterSign supports specifying a <key-id>
### Signature Creation
The important part here is the way Git interfaces with external tools. When it
shells out to GPG to create a signature, it pipes the content to be signed to
GPG over stdin and reads back the signature over stdout and the machine
parseable status over stderr (i.e. fd 2).
#### Commits
0. If the sign commit flag is passed to Git (e.g. `git commit -S`), Git calls
the do_sign_commit function (commit.c around line 937), which drives the whole
process.
1. If the key-id is not passed on the command line (e.g. `git commit
--gpg-sign=<key-id>`) then do_sign_commit calls the get_signing_key function
(commit.c around line 951).
2. The get_signing_key function returns the signing key set up when the
git_gpg_config function (gpg-interface.c) is called from the porcelain setup,
or it returns the committer's name and email. The git_gpg_config
function parses the .gitconfig for the gpg.* config keys, and if the
user.signingkey setting exists, the key id is stored and later
returned from get_signing_key.
3. The do_sign_commit now calls sign_buffer (commit.c around line 952) to sign
the commit information (e.g. hash of parent commits, author, committer,
encoding, extra header information, then a newline followed by the commit
message).
4. The sign_buffer function initializes the child_process struct with the
correct executable name and parameters to run the external tool to sign the
data and to get a machine parseable status back.
5. When running GPG, Git uses the following command line to sign a commit:
`gpg --status-fd=2 -bsau <key-id>`. The `-b` option tells GPG to create a
"detached" signature that only contains the signature data and not the data
that was signed. The `-s` option tells GPG to do the sign operation. The
`-a` option tells GPG to "armor" the output as ASCII instead of binary. The
`-u` option tells GPG that the <key-id> that follows it specifies
which key to use to sign the data.
6. The sign_buffer function then executes the child process gathering the
resulting signature and status information. It does a string match against
the status data looking for "\n[GNUPG:] SIG_CREATED " to determine if the
signing operation succeeded.
7. If the signing was successful, the do_sign_commit function then inserts the
signature as a "gpgsig" header in the commit data.
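Steps 4 through 6 can be approximated in Python as follows. This is a sketch of the interaction pattern, not Git's C code: the function names mirror the description above but are otherwise hypothetical, and `sign_buffer` assumes a `gpg` executable on the PATH with a usable secret key.

```python
import subprocess
from typing import List, Optional

SIG_CREATED = "\n[GNUPG:] SIG_CREATED "  # string Git searches for (step 6)


def build_sign_argv(program: str, key_id: str) -> List[str]:
    """Command line from step 5: detached, armored signature made with key_id."""
    return [program, "--status-fd=2", "-bsau", key_id]


def signing_succeeded(status: str) -> bool:
    """Step 6: a plain substring match against the status output."""
    return SIG_CREATED in status


def sign_buffer(payload: bytes, key_id: str, program: str = "gpg") -> Optional[bytes]:
    """Pipe payload to the signer; return the signature, or None on failure."""
    proc = subprocess.run(
        build_sign_argv(program, key_id),
        input=payload,            # content to sign goes over stdin
        stdout=subprocess.PIPE,   # signature comes back over stdout
        stderr=subprocess.PIPE,   # status comes back over stderr (fd 2)
    )
    status = proc.stderr.decode("utf-8", errors="replace")
    if proc.returncode != 0 or not signing_succeeded(status):
        return None
    return proc.stdout
```

Because the success check is only a substring match on the status stream, a replacement tool has to emit a compatible status line on the requested file descriptor for an unmodified caller to accept its signature.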
#### Tags
0. Signing tags is much simpler. If the sign flag `-s` is passed to the
`git tag` command, after the tag is built, it is passed to the sign_buffer
function. Currently the `git tag` command does not support specifying the
<key-id> on the command line so step 1 in the Commits section above is not
done. Only the git_gpg_config function is called to load and store the
correct signing key id before calling the sign_buffer function.
1. Steps 3-6 in the Commits section above are the same.
2. After the sign_buffer returns, if it was successful, the signature data is
simply appended to the end of the tag data.
### Signature Verification
The important part here is the way Git interfaces with external tools. When it
shells out to GPG, it first writes the signature to a temporary file and then
pipes the signed content to GPG over stdin and reads back the machine parseable
status over stdout. Below is a detailed explanation of the execution flow for
verifying signed commits and tags.
#### Commits
0. Commit verification starts in commit.c in the check_commit_signature
function.
1. It first parses the commit buffer looking for the "gpgsig" start sigil and
then extracts the signature data up to the next empty line in the commit
buffer (commit.c parse_signed_commit).
2. It then calls check_signature passing the commit buffer and the signature
buffer (commit.c around line 1099). This is the entry point into the GPG
infrastructure code in gpg-interface.c.
3. The check_signature function initializes the signature check structure
members and then calls verify_signed_buffer (gpg-interface.c around line
196).
4. The verify_signed_buffer first creates a temporary file (mkstemp) and saves
the signature to the file.
5. The verify_signed_buffer then calls get_format_by_sig which does string
matching against the signature buffer to determine if the signature is a GPG
signature ("-----BEGIN PGP SIGNATURE-----" or "-----BEGIN PGP MESSAGE-----")
or a GPGSM signature ("-----BEGIN SIGNED MESSAGE-----").
6. The verify_signed_buffer initializes a child_process struct with the correct
executable name and parameters to run the external tool to verify the
signature and get the machine parseable response.
7. When running GPG, Git uses the following command line:
`gpg --keyid-format=long --status-fd=1 --verify <tempfile with signature> -`
The code pipe-forks using Git's child_process infrastructure and pipes the
commit buffer to the child process' stdin and reads the machine parseable
status outputs from the child process' stdout (--status-fd=1).
8. The verify_signed_buffer then checks for the GPG string "\n[GNUPG:] GOODSIG "
to set the value of the return status to 1/true if the string exists.
9. The check_signature finishes by calling parse_gpg_output to parse the status
output into a signature status result character (e.g. 'U', 'E', 'N'), each of
which has a different meaning. The meanings are listed in gpg-interface.c
around line 97. It also parses out the signature verification lines that show
the creator, the date, and the status so that they can be output to the log if
needed.
10. The stack unwinds back to commit.c.
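Steps 1, 7, and 8 above can be sketched in Python as shown below. The helper names are hypothetical, and the header parsing is simplified: it assumes that continuation lines of the multi-line "gpgsig" header start with a single space and that the header block ends at the first empty line.

```python
from typing import List, Optional, Tuple

GPGSIG = "gpgsig "
GOODSIG = "\n[GNUPG:] GOODSIG "  # string Git searches for (step 8)


def split_signed_commit(buf: str) -> Tuple[str, Optional[str]]:
    """Return (commit without the gpgsig header, signature or None)."""
    headers, sep, message = buf.partition("\n\n")
    payload, signature, in_sig = [], [], False
    for line in headers.split("\n"):
        if line.startswith(GPGSIG):
            in_sig = True
            signature.append(line[len(GPGSIG):])
        elif in_sig and line.startswith(" "):
            signature.append(line[1:])  # drop the continuation space
        else:
            in_sig = False
            payload.append(line)
    stripped = "\n".join(payload) + sep + message
    return stripped, ("\n".join(signature) + "\n" if signature else None)


def build_verify_argv(program: str, sig_path: str) -> List[str]:
    """Command line from step 7; the signed data is read from stdin ('-')."""
    return [program, "--keyid-format=long", "--status-fd=1", "--verify", sig_path, "-"]


def is_good_signature(status: str) -> bool:
    """Step 8: a plain substring match against the status output."""
    return GOODSIG in status
```

The first function yields the two buffers that are handed to the verifier: the signature, which is written to a temporary file, and the commit data without the signature, which is piped over stdin.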
#### Tags
0. Tag verification starts in tag.c in the gpg_verify_tag function.
1. It first reads the oid_object_info to make sure the object is a tag before
reading the object file into memory (tag.c around line 47).
2. It then runs run_gpg_verify which first calls parse_signature which searches
for the GPG/GPGSM signature begin strings (e.g. "-----BEGIN PGP SIGNATURE-----")
and extracts the signature up to the first empty line after the signature
begin string.
3. Then the run_gpg_verify calls the check_signature function passing the
tag contents buffer and the signature buffer (tag.c line 29). This is the
main entry point into the GPG infrastructure code in gpg-interface.c.
4. The execution then follows steps 3-9 in the Commits section above.
5. The stack unwinds back to tag.c where the last thing that happens is
printing the signature check status if that was asked for.
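The signature extraction in step 2 can be illustrated with the Python sketch below. It is a simplification with a hypothetical helper name: it only looks for the GPG marker and treats everything from the first line that starts with the marker to the end of the buffer as the signature, relying on the signature having been appended to the end of the tag data when the tag was signed.

```python
from typing import Tuple

SIG_BEGIN = "-----BEGIN PGP SIGNATURE-----"


def split_signed_tag(buf: str) -> Tuple[str, str]:
    """Split a tag buffer into (payload, signature).

    The signature is empty if no line starts with the marker.
    """
    pos = 0
    while pos < len(buf):
        if buf.startswith(SIG_BEGIN, pos):
            return buf[:pos], buf[pos:]
        nl = buf.find("\n", pos)
        if nl == -1:
            break
        pos = nl + 1  # move to the start of the next line
    return buf, ""
```

The payload half is what gets piped to the verifier over stdin, while the signature half is written to the temporary file, exactly as in the commit case.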