jelmer/dulwich

verification of commit signatures with SSH key

Opened this issue ยท 13 comments

Discussed in #1391

Originally posted by castedo October 20, 2024
I am planning to migrate from using GitPython to Dulwich, but I will need to be able to verify commit signatures with SSH keys.

Am I correct that Dulwich does not have this functionality? It looks like tag signatures with GPG keys is only supported.

Are there any libraries or existing Python code that can do this? I've searched long and far and it seems the only Python code that replicates the core functionality of the type of SSH key signatures with git is https://github.com/grawity/ssh-datasign (thank you @grawity!) This is the SSHSIG type signature implemented by ssh-keygen -Y verify which is what git uses for SSH key signatures (nice blog post).

I've created https://gitlab.com/perm.pub/dulwich-sshsig-union for testing, coordination, and a bit of experimentation.

I somehow failed to recognize that my goal of switching hidos from GitPython to Dulwich will also require signing in addition verification. ๐Ÿคฆ So the mission of this thread has expanded to include signing too. But the code from grawity should make this relatively easy if I'm already doing all the work to test verification well.

Quick update on my thoughts, now that I've gotten verification partially working with hidos/sshsiglib/dulwich. I'm now planning to have hidos using Dulwich this month. But I think hidos is going to continue depending on GitPython & classic git for creating/amending/writing document succession for a long time. But reading document successions will be via Dulwich. So verification with Dulwich is high priority but signing commits via Dulwich is low priority for me for awhile. I still want to work towards getting signing into Dulwich too, but it's just not as urgent. I hope signing isn't that much work once verification is working.

#1431 is first subtask for SSH key based commit verification.

@jelmer Here's a quick update on my current thinking. I'm thinking of releasing version 2.0 of the hidos package this month and including a "vendored" library of sshsig inside the hidos package, but not as a submodule of hidos. sshsig and hidos are separate top level modules. Then as a follow-on stage, some of the code can be copied and/or sshsig can turn into a separate stand-alone package. Nobody that wants just the sshsig functionality will also want hidos.

sshsig code is also currently in a stand alone repository, currently at https://gitlab.com/perm.pub/sshsiglib/. For now, this repo is also submodule inside inside the hidos repository so that it can be packaged/vendored along with hidos 2.0.

I'll post some more details here this week on the sshsig and dulwich functions and types that hidos is calling which hopefully can provide some insights and guidance on improved approaches in other dulwich applications.

@jelmer I've just released https://pypi.org/project/hidos/2.0.1/. This package includes and installs an sshsig0 module that is sourced via git submodule from https://gitlab.com/perm.pub/sshsiglib/.

After doing some work on dulwich my currently thinking is to release an independent separate sshsig Python package. With the new Commit.raw_without_sig method it is now super easy to check that a git commit is properly signed with an SSH key and get that SSH public key using a new sshsig package like this:

from sshsig import check_signature
from dulwich.objects import Commit

c : Commit = ...

pub_key = check_signature(c.raw_without_sig(), c.gpgsig)

Whether that pub_key SSH public key is valid or not I see as highly application specific and outside to scope of any general purpose code I plan on writing. The hidos package has its own application-specific policy for what public keys are "valid".

I also think "verification" of what public keys are acceptable is at a much higher level and application specific level than what Dulwich should do (just my take and opinion). Dulwich provides Commit.raw_without_sig and Commit.gpgsig and that's a great interface for Dulwich to provide.

I've learned a ton reading and using this repository that you have been working on. Thanks for sharing and open-sourcing! If you have suggestions for what I add to this new sshsig package feel free to share. I've already been copying various bits and pieces here and there. Feel free to fork sshsig but just let me know so I adjust my plans (such as holding off from moving from unittest to pytest). Cheers!

@castedo I think releasing sshsig as a python package would make sense. I'm happy to maintain it, or perhaps we could comaintain it?

Then Dulwich could (optionally) pull in sshsig in the same way that it currently optionally imports gpg to verify signatures. Manually verifying signatures from Dulwich consumers is reasonable but I think we should also provide a higher level API for convenience.

sshsig as a python package would make sense

@jelmer OK, we both think a separate sshsig Python package make sense. I'm currently planning to name it sshsig but at first I had thought sshsiglib. Let me know if you think one or a 3rd is a better package name.

I'm happy to maintain it

We should try and get on a path where you are the primary maintainer long-term. You have more experience being a maintainer and my hunch is you are more eager to be a maintainer than I am. I definitely want to help reduce the bus factor by making it as easy as possible for you to maintain it if I am not doing enough work to make it an acceptable package and dependency for Dulwich.

I'll start creating new discussion/issue/work items in either GitHub or GitLab.

Do you have a preferences between GitHub versus GitLab?

To me they are both great and terrible in their own special ways. I just try to use both, with a bias toward GitLab unless there's a clear advantage to GitHub.

My preference would be for "sshsig" - simplify because every package in Python is essentially a library so the "lib" bit seems superfluous.

If the expectation is that sshsig is primarily an add-on library for Dulwich, then I'd prefer to keep it on GitHub. I'm not necessarily the biggest fan of GitHub, but that's where the main dulwich repository is today.

As you mention, having more maintainers is simply good for the bus factor. Hopefully there isn't a lot of ongoing maintenance necessary, but we'll have to see. Either way, it's good to have multiple people who can e.g. hit merge on PRs.

OK sshsig it is and on GitHub.

If the expectation is that sshsig is primarily an add-on library for Dulwich

Realistically in the short-term, sshsig is only getting used with Dulwich however my hope is that sshsig will be useful to other applications that aren't git or Dulwich specific. And in the long-term it's definitely possible that my hope of other applications never happens. But now that I understand SSHSIG pretty well, it really is quite independent of git, and with a natural API that is totally independent of git.

also provide a higher level API for convenience

Regarding a high level API, I don't doubt that there is some kind of higher level API that will be convenient for some Dulwich users. It's just not clear to me for which application/users that API would be and thus what it looks like. This is partly because I would not use it from hidos. Calling Commit.raw_without_sig() is a very convenient API to call from hidos.

It's also not clear to me that Dulwich can provide this convenient higher level API. For example, when I think about what GitHub is probably doing to show "Verified" for some commits, it's likely using internal high-level APIs that Dulwich can not provide or currently can't know how to provide.

Regarding the existing API for a 'verify` with a boolean true/false result and an optional list of keys, it seems to me a misleading function name and functionality, at least in the case of SSH signatures.

So overall, a high-level API opens up a can of worms of what is the high-level task that a Dulwich user is actually trying to achieve. There isn't a simple answer to what makes an SSH signature on a git commit "valid". There is no information inside an SSH signature alone that connects it to the identity of the git author or git committer. And there are all kinds of concerns about expiration dates, what types of public key are OK or not (are DSA keys acceptable or not, some say no ... are RSA keys of few bits OK? etc... etc...). Does a high-level API need to provide control for all these things? So much depends on the application.

also provide a higher level API for convenience

Regarding a high level API, I don't doubt that there is some kind of higher level API that will be convenient for some Dulwich users. It's just not clear to me for which application/users that API would be and thus what it looks like. This is partly because I would not use it from hidos. Calling Commit.raw_without_sig() is a very convenient API to call from hidos.

It's also not clear to me that Dulwich can provide this convenient higher level API. For example, when I think about what GitHub is probably doing to show "Verified" for some commits, it's likely using internal high-level APIs that Dulwich can not provide or currently can't know how to provide.

Regarding the existing API for a 'verify` with a boolean true/false result and an optional list of keys, it seems to me a misleading function name and functionality, at least in the case of SSH signatures.

So overall, a high-level API opens up a can of worms of what is the high-level task that a Dulwich user is actually trying to achieve. There isn't a simple answer to what makes an SSH signature on a git commit "valid". There is no information inside an SSH signature alone that connects it to the identity of the git author or git committer. And there are all kinds of concerns about expiration dates, what types of public key are OK or not (are DSA keys acceptable or not, some say no ... are RSA keys of few bits OK? etc... etc...). Does a high-level API need to provide control for all these things? So much depends on the application.

The git command-line tool can also verify signatures, so I don't see how Dulwich is any different in that regard - it should be looking at the same data. Initially users may have to pass in some of the context (such as keys) until dulwich supports reading that context from the same places C git does - users shouldn't have to reimplement all of that logic.

The support for parsing allowed signers is git-specific and should be a part of dulwich, not sshsig. Applying the rules in such a file is also non-trivial and you'd want support from dulwich to apply them.

The git command-line tool can also verify signatures

I was referring to a higher level API that would be convenient. The git CLI is not a convenient high level API. I'm not sure why anybody would want that interface that the git CLI provides. If somebody really wants to have exactly that interface that the git CLI provides I recommend they call out to the git CLI as a subprocess.

support for parsing allowed signers is git-specific and should be a part of dulwich, not sshsig

Yeah, that makes sense. Especially since you are eager to have Dulwich emulate git more closely in this area. I can help you get close to that goal, but only partially.

A minor pedantic point I want to make: The allowed signers file format parsing is most definitely not git-specific. It's ssh-keygen specific. I wish the allowed signers file format was git-specific because the developers on both sides of the git and ssh-keygen sides of the implementation would have picked a file format and algorithm that is sensible. Right now it just seems like a strange hack. I'm guessing the strangeness results from the git and ssh-keygen sides being implemented at different times for different priorities and not that much communication between the two sides.

But pedantic points aside, even though SSHSIG signatures and ssh-keygen are from the same SSH codebase and not from the git project, I agree it makes sense for the allowed signers parsing to happen in Dulwich, at least for the foreseeable future.