Distributed build validation via Git + IPFS + PowerShell

Question

Closed this issue 2 years ago · 2 comments

Answer 1 · 2022-07-23T00:42:48.000Z

Proposal amendments

After more research into the inner workings of Git, I've found that we may be able to lean on Git even more, and massively simplify our setup in the process.

Permission management

It's a solved problem

Most services, like GitHub and Azure DevOps, already come with built-in permission management for the upstream repository, covering both branches and tags.

We can rely on the developer to set up this security.
This gives the validator more freedom over where results are pushed.
Annotated git tags can be signed and verified with GPG (docs), which we can lean on as a trust system.
This means we can allow contributors with repo push permissions to push build validation tags directly.
For forks, where contributors don't have repo push permissions:
- They could push to their own fork, and we could add that as a remote, swap branches, and continue as usual.
  - We should be able to push the tag to our remote on their behalf
  - Will need to pull and swap to a branch from a second remote.
  - Doing this also pushes the commits in the tag to the remote, even if the commits don't exist on a branch.
  - git gc should clean these up automatically if the fork's branch is never merged, but only once the tag itself is deleted, since a tag keeps its commits reachable.
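
The fork flow above could look roughly like this. It's a minimal sketch that only builds the git command lines rather than running them. The function name, the `contributor-fork` remote name, and the example URL and tag are all hypothetical; the git subcommands themselves (`remote add`, `fetch`, `switch --detach`, `tag -s`, `push`, `verify-tag`) are standard.

```python
from typing import List


def fork_validation_commands(
    fork_url: str,
    fork_branch: str,
    tag: str,
    fork_remote: str = "contributor-fork",  # hypothetical remote name
    our_remote: str = "origin",
) -> List[List[str]]:
    """Build the git commands for validating a fork branch and tagging it."""
    fork_ref = f"{fork_remote}/{fork_branch}"
    return [
        # Add the contributor's fork as a second remote and fetch it.
        ["git", "remote", "add", fork_remote, fork_url],
        ["git", "fetch", fork_remote],
        # Check out the fork's branch tip without creating a local branch.
        ["git", "switch", "--detach", fork_ref],
        # (Build and validate here, before tagging.)
        # Create a GPG-signed annotated tag on the validated commit.
        ["git", "tag", "-s", tag, "-m", f"Build validated for {fork_ref}"],
        # Push only the tag to our remote; the tagged commits travel with it.
        ["git", "push", our_remote, f"refs/tags/{tag}"],
    ]


def verify_tag_command(tag: str) -> List[str]:
    """Command that checks the GPG signature on a tag."""
    return ["git", "verify-tag", tag]


if __name__ == "__main__":
    for cmd in fork_validation_commands(
        "https://example.invalid/contributor/repo.git", "feature", "validated/example"
    ):
        print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the flow has been checked by hand.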

Preserving validated commits

Tags, branches, and you

Git tags are made against a specific commit and cannot point to a branch. This means it's on us to make sure a built/validated commit is preserved in git history when merging a branch.

A commit's hash identifies an immutable object that records a snapshot of the repo's files, plus additional git information such as author, time, and message, and a link to the parent commit, meaning it indirectly references all previous changes as well. It accurately represents the entire state of the repo at that commit.
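
To make that concrete, here is a small sketch of how Git derives an object ID, assuming a repo using the default SHA-1 object format. The helper names and the example author line are made up for illustration. The point is that the tree, the parents, the author/committer lines, and the message all feed into the hash, so changing any one of them yields a different commit.

```python
import hashlib
from typing import List


def git_object_id(obj_type: str, body: bytes) -> str:
    """Hash an object the way Git does: '<type> <size>' + NUL + body, SHA-1."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: List[str], author: str,
                committer: str, message: str) -> bytes:
    """Assemble the text of a commit object from its parts."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return ("\n".join(lines) + "\n").encode()


if __name__ == "__main__":
    empty_tree = git_object_id("tree", b"")
    who = "Example Dev <dev@example.invalid> 1658536968 +0000"  # placeholder identity
    root = git_object_id("commit", commit_body(empty_tree, [], who, who, "Initial commit"))
    # Same tree, author, and message, but with a parent: a different hash.
    child = git_object_id("commit", commit_body(empty_tree, [root], who, who, "Initial commit"))
    print(empty_tree)
    print(root != child)
```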

Can we build it?

I've verified the following behavior:

If you merge the upstream branch into your current branch first, you can build/validate that commit while still in a PR, and as long as history is linear and your commit is preserved in history, people can see that you were able to build successfully!
The above behavior holds even when there are merge conflicts.
As long as history is linear, the merge commit can be validated before or after the actual merge and will stay preserved.
(!!!) If history is NOT linear, such as when squashing, the commit that the contributor built/validated won't be included in the upstream branch, and will be lost.
- Solution: squash locally, push only tags for that commit
  - Can we tag a nonexistent commit? - Yes, as long as you can switch to it.
  - Can we push a tag for a nonexistent commit?
    - Yes, even if the branch was deleted
    - Doing this also pushes the commits in the tag to the remote, even if the commits don't exist on a branch.
    - git gc should take care of this automatically once the tag is deleted; until then, the tag keeps those commits reachable
  - Can we predict the squash commit hash? - Yes, if you can pull both branches. Just do the squash locally
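
A rough sketch of the "squash locally, push only the tag" flow from the list above. As with any sketch, the function and parameter names are hypothetical, and it only builds the command lines. One assumption worth stating loudly: the tag lines up with upstream only if this exact commit object is what ends up there. A squash made separately, for example by the hosting service's merge button, would generally get a different hash, because the committer timestamp and message are part of the commit.

```python
from typing import List


def squash_validation_commands(
    upstream_ref: str,
    pr_ref: str,
    tag: str,
    remote: str = "origin",
    sign: bool = False,
) -> List[List[str]]:
    """Build the git commands for squashing locally and tagging the result."""
    tag_flag = "-s" if sign else "-a"  # -s: GPG-signed, -a: plain annotated
    return [
        ["git", "fetch", "--all"],
        # Start from the upstream tip without touching any local branch.
        ["git", "switch", "--detach", upstream_ref],
        # Stage the PR's combined changes, then record them as one commit.
        ["git", "merge", "--squash", pr_ref],
        ["git", "commit", "-m", f"Squash {pr_ref} onto {upstream_ref}"],
        # (Build and validate here, before tagging.)
        ["git", "tag", tag_flag, tag, "-m", f"Build validated: {pr_ref} squashed onto {upstream_ref}"],
        # Push only the tag; the squash commit is sent along with it.
        ["git", "push", remote, f"refs/tags/{tag}"],
    ]


if __name__ == "__main__":
    for cmd in squash_validation_commands("origin/main", "origin/feature", "validated/squash-example"):
        print(" ".join(cmd))
```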

Yes, we can

No matter what merge strategy is used, we have the ability to preserve arbitrary information in tags, as long as the code exists on the remote.

The only caveat is that unless we know what merge strategy is going to be used, we'll have to create tags for both the upstream merge commit and a squash commit. We can make this part of the config on the "build validator" side of things.
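
As a sketch of what that config could drive on the validator side: given the configured merge strategy, pick which commits need a validation tag. The enum values and the function name are hypothetical, not an existing setting.

```python
from enum import Enum
from typing import Dict


class MergeStrategy(Enum):
    MERGE_COMMIT = "merge-commit"
    SQUASH = "squash"
    UNKNOWN = "unknown"  # strategy not configured or not known ahead of time


def commits_to_tag(strategy: MergeStrategy, merge_commit: str,
                   squash_commit: str) -> Dict[str, str]:
    """Return the commits that need a validation tag, keyed by kind."""
    if strategy is MergeStrategy.MERGE_COMMIT:
        return {"merge": merge_commit}
    if strategy is MergeStrategy.SQUASH:
        return {"squash": squash_commit}
    # Unknown strategy: tag both so either outcome stays covered.
    return {"merge": merge_commit, "squash": squash_commit}


if __name__ == "__main__":
    # Placeholder hashes, for illustration only.
    print(commits_to_tag(MergeStrategy.UNKNOWN, "aaaa111", "bbbb222"))
```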

I'll have an updated graphic soon™, once we're done iterating.

Long term considerations

IPLD Integration

https://github.com/ipfs/go-ipld-git
Would integrate nicely into these scripts without changes.
Instead of storing the data in tags, store the CID as a "link" in IPLD.
Is there a way we can manipulate Git to make this translation happen automatically in the future? Needs investigation.
If the above is possible while maintaining the link to a Tag, it's MUCH better than throwing the CID into the tag message.
Git allows you to create arbitrary blob objects, create object trees from scratch, and link to either of them in a commit. All of these would work natively with any IPLD translation layer.
So far, I haven't found an easy way to do this. We may need to settle for putting the CID in a structured tag message and make a converter later if needed.
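
For the fallback option, a structured tag message could be as simple as a summary line followed by `Key: value` lines, similar in spirit to git trailers. This is only a sketch: the field names (`Build-CID`, `Build-Status`) and the CID value are placeholders, not a settled format.

```python
from typing import Dict


def format_tag_message(summary: str, fields: Dict[str, str]) -> str:
    """Summary line, blank line, then one 'Key: value' line per field."""
    field_lines = [f"{key}: {value}" for key, value in fields.items()]
    return summary + "\n\n" + "\n".join(field_lines) + "\n"


def parse_tag_message(message: str) -> Dict[str, str]:
    """Read the 'Key: value' lines back out of the final paragraph."""
    fields: Dict[str, str] = {}
    paragraphs = message.strip().split("\n\n")
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that aren't 'Key: value'
            fields[key] = value
    return fields


if __name__ == "__main__":
    msg = format_tag_message(
        "Build validated",
        {"Build-CID": "bafyEXAMPLEPLACEHOLDER", "Build-Status": "success"},  # placeholder CID
    )
    print(msg)
    print(parse_tag_message(msg))
```

Keeping the fields machine-readable is what would let a later converter lift the CID out of the tag message and into a proper IPLD link.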

What if we went all in?

If we had unlimited time and resources, what would we make?

This has been moved to Arlodotexe/brain-dump#2

Answer 2 · 2022-07-25T01:56:25.000Z

Since the proposal has obviously gotten out of hand, I've created a new brain dump repo and moved these ideas over.

In order to keep moving Strix forward, I don't plan on working on these things until much later, unless it becomes a requirement.
