managarm/xbstrap

Add versioning based on VCS information

ArsenArsen opened this issue · 30 comments

some packages will need to generate their version dynamically, based on information from git repositories; one of these packages is managarm-system, another is mlibc

naturally, this state of things should be considered temporary, and proper versioning should be introduced into mlibc and managarm. in the meantime, however, arch (for once) has a decent solution: https://wiki.archlinux.org/index.php/VCS_package_guidelines#The_pkgver()_function and specifically

pkgver() {
  cd "$pkgname"
  printf "r%s.%s" "$(git rev-list --count HEAD)" "$(git rev-parse --short HEAD)"
}
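For readers more at home in Python (the language xbstrap is written in), the same scheme could be sketched roughly as below. This is only an illustration of the `r<count>.<short hash>` format from the Arch snippet, not what xbstrap does; the function names `format_vcs_version`, `git_output` and `vcs_version` are made up for this example.

```python
import subprocess


def format_vcs_version(commit_count: int, short_hash: str) -> str:
    """Mirror the printf "r%s.%s" format used by the pkgver() snippet."""
    return f"r{commit_count}.{short_hash}"


def git_output(repo_dir: str, *args: str) -> str:
    """Run a git command inside repo_dir and return its stripped stdout."""
    result = subprocess.run(
        ["git", *args],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def vcs_version(repo_dir: str) -> str:
    """Build the version string from the checked-out HEAD of repo_dir."""
    # Number of commits reachable from HEAD; grows as history grows.
    count = int(git_output(repo_dir, "rev-list", "--count", "HEAD"))
    # Abbreviated commit hash, kept only as extra information.
    short_hash = git_output(repo_dir, "rev-parse", "--short", "HEAD")
    return format_vcs_version(count, short_hash)
```

Note that only the commit count is monotonically increasing; the hash part is informational and does not help a package manager order versions.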

this is of high priority: tonight's CI run will be largely futile without it

I assume that without this addition, we will just not rebuild version 0.0_0 packages?

indeed, and if we simply mark some packages as always out of date, xbps won't know about it, so the right fix is to autogenerate versions

Alright, I will try to implement the git rev-list --count strategy as soon as possible.

git rev-parse for additional information would also be nice, if it's not too much to ask for (rev-parse on HEAD + rev-list --count on the result of that should have no major edge cases)

I pushed a @VCS_NUM_COMMITS@ substitution to master but now I am having second thoughts: this requires us to pull the sources and to patch them before we can determine the version.

theoretically, we could probably bend git into operating directly on the remote (just need to follow some hashes by uploading some packs). I am not sure if patches should be accounted for in the substitution; those could be handled by bumping the revision instead (the revision, rather than the version, is incremented when the build script changes but the version does not, which would include adding or removing patches). regardless, that was my initial solution (this should be lazily resolved)

to extend on that I'd have:

  1. added a compute_version field (similarly formatted to how steps are formatted)
  2. fetched and patched sources if that's set
  3. run the script/program in compute_version

I'm not sure how that would interact with the regenerate step, though

The fact that we need to download sources is still very awkward. xbbs would first need to run something like xbstrap patch <sources with dynamic versions> before compute-graph. The workers would also need to download the sources of dependencies to determine their versions. In general this is all quite tedious because we assume in various places that the version is available without additional computation (certainly without running build steps).

downloading the sources can be avoided, in theory, using git-upload-pack, and I don't think any sources need to be patched. however, what does need to happen is that the revisions of sources for these packages will have to be output by xbstrap-pipeline to be sent to the workers

So we need an additional file (or section of the bootstrap-site.yml) to pass the VCS information to xbstrap.

right, true, which might tie into the mirror/source override stuff that was talked about being added to bootstrap-site.yml

Okay, then I propose the following solution:

In bootstrap.yml, we mark some packages as rolling_version: true. We add a new file bootstrap-commits.yml that fixes a commit for each package that has a rolling version. Instead of @VCS_NUM_COMMITS@, we use a @ROLLING_COMMIT@ that is also specified in bootstrap-commits.yml. xbbs needs to fetch all sources with rolling versions on the coordinator, build a bootstrap-commits.yml (before compute-graph) and pass it to the workers.

I think it's cleaner to not duplicate the fetch logic, leaving it in xbstrap should be fine? I'm not sure why adding the commit to the graph (besides the version) is any worse?

The main issue is that we need the commits before compute-graph. xbbs can of course use xbstrap to fetch the packages, it doesn't need to call git itself.

how does xbbs get the list of packages with the rolling version? what prevents compute-graph from getting those packages and emitting the commits for xbbs to send to the workers?

I am considering adding xbstrap rolling-versions fetch + xbstrap rolling-versions show commands to fetch packages with rolling versions and to perform the rev-list --count.

Nothing prevents compute-graph from doing it in principle but it is quite unexpected (and probably not desired for use cases outside of xbbs) that compute-graph runs build steps.

The corresponding changes are on master.

  • xbstrap rolling-versions fetch to fetch all sources with rolling versions.
  • xbstrap rolling-versions determine to get the rolling IDs:
$ xbstrap rolling-versions determine
bragi: '68'
cralgo: '1'
cxxshim: '30'
lai: '426'
libarch: '55'
managarm: '2721'
mini-lspci: '8'
mlibc: '1251'
perf-tests: '9'
  • bootstrap-commits.yml in the source directory (not build directory!) to pass this information to xbstrap-pipeline and the workers:
commits:
    bragi:
        rolling_id: '68'
    cralgo:
        rolling_id: '1'
    cxxshim:
        rolling_id: '30'
    lai:
        rolling_id: '426'
    libarch:
        rolling_id: '55'
    managarm:
        rolling_id: '2721'
    mini-lspci:
        rolling_id: '8'
    mlibc:
        rolling_id: '1251'
    perf-tests:
        rolling_id: '9'

Rolling versions disable shallow fetching (obviously) and there is some sanity checking (in the pack phase) to make sure that bootstrap-commits.yml matches the source directory that we're building from.
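As a rough illustration of how a coordinator such as xbbs might turn the output of xbstrap rolling-versions determine into a bootstrap-commits.yml, here is a minimal Python sketch. It assumes the output is exactly the flat `name: 'id'` form shown above and renders the file by hand instead of using a YAML library; `parse_rolling_ids` and `render_commits_yaml` are hypothetical helper names, not part of xbstrap or xbbs.

```python
from typing import Dict


def parse_rolling_ids(determine_output: str) -> Dict[str, str]:
    """Parse lines of the form "name: 'id'" into a {name: id} mapping."""
    ids: Dict[str, str] = {}
    for line in determine_output.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, value = line.partition(":")
        # Drop surrounding whitespace and the quotes around the ID.
        ids[name.strip()] = value.strip().strip("'\"")
    return ids


def render_commits_yaml(rolling_ids: Dict[str, str]) -> str:
    """Render the mapping in the bootstrap-commits.yml layout shown above."""
    lines = ["commits:"]
    for name in sorted(rolling_ids):
        lines.append(f"    {name}:")
        lines.append(f"        rolling_id: '{rolling_ids[name]}'")
    return "\n".join(lines) + "\n"
```

A real implementation would more likely go through a YAML parser; the hand-rolled version here just keeps the sketch free of dependencies.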

why are we converting commit hashes to rolling ids rather than keeping both?

We are not. Rolling IDs are used to construct the version (they equal git rev-list --count). On the other hand, we still use normal tags / branches to check out commits.
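To make "used to construct the version" concrete, here is a small sketch of substituting a rolling ID into a version template. The placeholder name `@ROLLING_ID@` and the template `0.0pl@ROLLING_ID@` are assumptions chosen for illustration (the thread itself only names @VCS_NUM_COMMITS@ and @ROLLING_COMMIT@), and `expand_version` is a hypothetical helper.

```python
def expand_version(template: str, rolling_id: str,
                   placeholder: str = "@ROLLING_ID@") -> str:
    """Replace the (assumed) placeholder in template with the rolling ID."""
    if placeholder not in template:
        # Packages without a rolling version keep their fixed version.
        return template
    if not rolling_id.isdigit():
        # The rolling ID is a commit count, so it must be a plain number.
        raise ValueError(f"rolling ID must be numeric, got {rolling_id!r}")
    return template.replace(placeholder, rolling_id)
```

For example, `expand_version("0.0pl@ROLLING_ID@", "2721")` yields `0.0pl2721`, which increases with every new commit.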

Enabling rolling_version: true for packages like managarm-system in Managarm is now blocked on xbbs support for bootstrap-commits.yml.

will get to that soon enough then, and in #35 (comment) I was asking because I don't see how xbstrap gets the right revision (eg imagine determining the version, then a push to managarm happening, then a build of managarm happening, that would get the wrong revision, no?)

[...] (eg imagine determining the version, then a push to managarm happening, then a build of managarm happening, that would get the wrong revision, no?)

That can happen right now but it is a separate issue - this could happen even for non-rolling versions (e.g., if the same source is required by multiple packages). We can fix it by adding another field for the SHA1 hash to bootstrap-commits.yml but let us first fix the versioning scheme.

Note that in the situation that you describe the build will fail instead of breaking silently if rolling_version: true is set.

would it not just break silently (break meaning produce the wrong version)?

No, because we match git rev-list --count against the info from bootstrap-commits.yml before we call xbps-create.
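The check described here could look roughly like the sketch below: compare the rolling ID pinned in bootstrap-commits.yml with the commit count of the checked-out source and abort before packaging if they differ. This is an illustration under the assumption that both values are already available; `RollingIdMismatch` and `check_rolling_id` are hypothetical names, not xbstrap's actual code.

```python
class RollingIdMismatch(Exception):
    """Raised when the source tree does not match the pinned rolling ID."""


def check_rolling_id(package: str, pinned_id: str, actual_count: int) -> None:
    """Fail loudly if the checked-out source differs from the pinned ID.

    pinned_id is the rolling_id string from bootstrap-commits.yml;
    actual_count is the result of `git rev-list --count` on the source.
    """
    if int(pinned_id) != actual_count:
        raise RollingIdMismatch(
            f"{package}: bootstrap-commits.yml pins rolling_id {pinned_id}, "
            f"but the source directory has {actual_count} commits"
        )
```

With this in place, a push that lands between determining the version and building the package makes the build fail instead of producing a package with the wrong version.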

ah, good, that's good enough for me.

what about adding a revision hash to the pkgver as well? it would be useful to know

I don't know if that's a good idea because it is not an increasing number.

Maybe it should go to the -B, --built-with <string> argument to xbps-create instead? Or the -c <change log>?

-B "built with xbstrap from commit $hash" would be good, -c should be more rigorous

now that changes relating to this are in production and are shown as working, perhaps it is time to consider adding commit hashes to the mix, since we would want those to identify revisions and reduce build failures (there's a four hour window for a push to happen to break a build)

Alright, let's close this issue and open another one.