Replacing ofBorg with Github actions
This is one of the two plans to ensure we can still run evaluation checks on GitHub in the future.
See https://discourse.nixos.org/t/infrastructure-announcement-the-future-of-ofborg-your-help-needed/56025
for more information.
To replace OfBorg’s functions with GitHub Actions the following tasks need to be implemented:
- Running evaluation checks on Nixpkgs
- Evaluating NixOS options.
- Identifying package rebuilds and adding appropriate labels to the repository.
- (Optional) Rebuilding selected packages for Linux/macOS.
I already created a proof of concept pull request here: #352808
Update
We have our first Jitsi meeting to coordinate the migration on 14.11 (today) at 17:00 UTC (18:00 Berlin time) at https://jitsi.lassul.us/nixos-infra
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
Evaluation checks take too many resources. I'm worried about whether GitHub Actions machines can run them in a reasonable time.
@Bot-wxt1221 I managed to run it in 5 minutes for naive nix-env evaluation based on the default.nix entry point and 15 minutes using the same logic that ofborg uses: https://github.com/Mic92/nixpkgs/actions/workflows/eval.yml
Both seem already faster compared to the hours of waiting for the ofborg queue that we experience today.
Also, this is not yet the end of the line for optimizations. We still have https://github.com/Mic92/nixpkgs/blob/main/pkgs/top-level/release-attrpaths-superset.nix to split evaluation into smaller parts that can even run in parallel.
Will PR commands like @ofborg build hello be supported with GitHub Actions?
I worry that bot accounts like ryantm-r can easily hit the limit of CI. CC @ryantm
Yes it's possible:
name: Trigger on PR Comment
on:
  issue_comment:
    types: [created]
jobs:
  run-on-comment:
    if: github.event.issue.pull_request != null && contains(github.event.comment.body, '/build')
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v3
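To act on the package named in such a comment, the job also has to read the comment text safely and check out the pull request instead of the default branch. Here is a minimal sketch of that, not an agreed design. It assumes the `/build` prefix from the workflow above, that attribute names follow on the same line, and that only owners, members, and collaborators may trigger builds. The 120-minute cap is an arbitrary placeholder.

```yaml
name: Build on PR comment (sketch)
on:
  issue_comment:
    types: [created]
permissions:
  contents: read   # the job runs code from the PR, so keep the token read-only
jobs:
  build:
    if: >-
      github.event.issue.pull_request != null &&
      startsWith(github.event.comment.body, '/build ') &&
      contains(fromJSON('["OWNER", "MEMBER", "COLLABORATOR"]'), github.event.comment.author_association)
    runs-on: ubuntu-latest
    timeout-minutes: 120   # placeholder cap chosen for this sketch
    steps:
      - uses: actions/checkout@v4
        with:
          # issue_comment runs against the default branch, so fetch the PR merge ref explicitly
          ref: refs/pull/${{ github.event.issue.number }}/merge
      - uses: cachix/install-nix-action@v30
      - name: Build the requested attributes
        env:
          # Pass the comment through the environment instead of interpolating it into the script
          COMMENT: ${{ github.event.comment.body }}
        run: |
          # First line of the comment, without carriage returns and the "/build " prefix
          attrs=$(printf '%s\n' "$COMMENT" | tr -d '\r' | head -n 1 | sed -E 's|^/build[[:space:]]+||')
          # Accept only plain attribute paths such as "hello" or "python3Packages.requests"
          if ! printf '%s' "$attrs" | grep -Eq '^[A-Za-z0-9._ +-]+$'; then
            echo "refusing unexpected input" >&2
            exit 1
          fi
          for attr in $attrs; do
            nix-build -A "$attr"
          done
```

Passing the comment body through `env` rather than writing `${{ ... }}` inside the script avoids shell injection from the comment text.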
I worry that bot accounts like ryantm-r can easily hit the limit of CI. CC @ryantm
Well. We have to try and see. Just now it's speculation if it works or not.
Good to know, though huge builds like kernel and its modules, chromium and firefox will obviously not work. And we'll possibly have to setup a blacklist else even individual contributors will hit their limits.
According to the GitHub docs:
GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners. For private repositories, each GitHub account receives a certain amount of free minutes and storage for use with GitHub-hosted runners, depending on the account's plan. Any usage beyond the included amounts is controlled by spending limits.
So maybe we don't need to worry about time?
Good to know, though huge builds like kernel and its modules, chromium and firefox will obviously not work. And we'll possibly have to setup a blacklist else even individual contributors will hit their limits.
You can run builds for 12h. Obviously we should establish some reasonable timeouts to be a good citizen in the ecosystem.
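For reference, a limit like this can be expressed with `timeout-minutes`, which GitHub Actions accepts on both jobs and steps. The following is only a sketch: the numbers are placeholders rather than proposed values, and `hello` stands in for whatever a PR actually touches.

```yaml
name: Build with a time limit (sketch)
on:
  pull_request:
jobs:
  build:
    runs-on: ubuntu-latest
    # The job is cancelled once it has run this long; the value is a placeholder
    timeout-minutes: 120
    steps:
      - uses: actions/checkout@v4
      - uses: cachix/install-nix-action@v30
      - name: Build one package
        # "hello" is a stand-in attribute for this sketch
        run: nix-build -A hello
        timeout-minutes: 90   # a step can carry its own, tighter limit
```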
Added a ^ meeting date for this.
Maybe of interest for this issue, at least just for inspiration, but I've also (ab)used GitHub actions to build tests in my project using a dynamically generated matrix. My project uses flakes but this should be adaptable to non-flakes https://github.com/ibizaman/selfhostblocks/blob/main/.github/workflows/build.yaml
This matrix then produces a big list of jobs, one job per test: https://github.com/ibizaman/selfhostblocks/actions/runs/11502502422
See the meeting notes for today's infra meeting where we mainly discussed the CI situation: https://github.com/NixOS/infra/blob/7688f20babbeb27a10e4d8669fffe4b0ed00e17f/docs/meeting-notes/2024-11-14.md
Here is the high-level plan:
- Infinisil wants to take a look at evaluating nixpkgs in github actions to compute the number of changed paths
- Independently we will take a look how we can build packages.
- To begin with, we will just run GitHub Actions as they are designed, on the pull_request event. This is the most straightforward way, and we have not actually validated whether we can just build everything fast enough without resorting to my initial strategy.
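As a rough illustration of the first point, the number of changed paths can be estimated by listing attribute and output path pairs before and after the PR and counting the lines that differ. This is a naive sketch based on the plain `default.nix` entry point, not the approach of any actual PR. The job name, file names, and timeout are made up for the example.

```yaml
name: Count changed output paths (sketch)
on:
  pull_request:
jobs:
  rebuilds:
    runs-on: ubuntu-latest
    timeout-minutes: 60   # placeholder
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.base.sha }}
          path: before
      - uses: actions/checkout@v4
        with:
          path: after   # default ref: the PR merged into its base branch
      - uses: cachix/install-nix-action@v30
      - name: List attribute and output path pairs for both trees
        run: |
          for tree in before after; do
            nix-env -f "$tree" -qaP --no-name --out-path | sort > "$tree.txt"
          done
      - name: Count lines that only exist after the PR
        run: |
          # comm -13 keeps lines unique to the second file: new or changed outputs
          changed=$(comm -13 before.txt after.txt | wc -l)
          echo "Changed or new output paths: $changed" >> "$GITHUB_STEP_SUMMARY"
```

Turning the count into a label would additionally need a token with write access to the pull request, which runs triggered from forks do not get by default.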
Independently of the meeting, we also have other discussions about how we can develop ofborg in the future. However, this might not happen before February, so we need some alternative solution in the meantime, if not longer.
I've opened a draft PR here for evaluating Nixpkgs using GitHub Actions: #356023. For just evaluation (and those only taking 5 minutes on each arch) instead of also building, I don't think we need to do the running-on-forks dance. Building is harder to get, but it's arguably also less important (and very orthogonal to evaluation).
One important aspect that ofborg currently provides, and that this issue doesn't mention, is the performance report.
This currently works by evaluating nixpkgs twice, once before the PR and once after.
For the majority of PRs the performance report is not important, but for work on lib & stdenv, it can be very important.
The report currently does not report the impact of checkMeta, something that has led to a less than stellar review experience since contributors & reviewers don't actually understand the real performance impact.
Could that be another on-demand GitHub Actions job? We could even run it automatically if certain paths have been changed.
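A path-triggered version could look roughly like the sketch below. It uses the `paths` filter of the `pull_request` event and Nix's `NIX_SHOW_STATS` and `NIX_SHOW_STATS_PATH` environment variables to evaluate the tree before and after the PR. The path globs, the naive `nix-env` invocation, and the assumption that the statistics JSON has a top-level `cpuTime` field are illustrative, not taken from ofborg's implementation.

```yaml
name: Evaluation statistics (sketch)
on:
  pull_request:
    paths:
      - 'lib/**'
      - 'pkgs/stdenv/**'
jobs:
  stats:
    runs-on: ubuntu-latest
    timeout-minutes: 60   # placeholder
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.base.sha }}
          path: before
      - uses: actions/checkout@v4
        with:
          path: after   # default ref: the PR merged into its base branch
      - uses: cachix/install-nix-action@v30
      - name: Evaluate each tree and save the evaluator statistics
        run: |
          for tree in before after; do
            NIX_SHOW_STATS=1 NIX_SHOW_STATS_PATH="$tree.json" \
              nix-env -f "$tree" -qaP --no-name > /dev/null
          done
      - name: Show CPU time side by side
        run: |
          # cpuTime is assumed to be a top-level field of the statistics JSON
          jq -s '{before: .[0].cpuTime, after: .[1].cpuTime}' before.json after.json \
            >> "$GITHUB_STEP_SUMMARY"
```

Single runs on shared runners are noisy, so a real report would probably want several evaluations per tree.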
Good to know, though huge builds like kernel and its modules, chromium and firefox will obviously not work. And we'll possibly have to setup a blacklist else even individual contributors will hit their limits.
Building linux kernel is fine on Github Actions, the CPU time is sufficient, it takes less than 2 hours to build Jovian-NixOS linux kernel, and Github Actions offer max 6 hours per run.
The only concern is disk space. Workarounds:
1. Bind mount /mnt/nix to /nix; /mnt has 66G free by default.
2. Set build-dir = /nix/var in nix.conf. By default Nix uses /tmp to hold /build in the sandbox, which takes up disk space in / (20G free), not enough for building the Linux kernel.
3. Remove files we don't need: docker images, /usr/local, /usr/share/swift, etc. It's possible to get more than 63G of free disk space in / without affecting Nix.
4. Use BTRFS RAID0 to combine / and /mnt, and enable zstd compression. It's possible to get a total of 126G of free disk space, which should be sufficient for most build tasks.
All of the above workarounds are implemented in https://github.com/azuwis/actions/blob/main/nix/prepare.sh.
Well, except for 2), which can be set by:
- uses: cachix/install-nix-action@v30
  with:
    extra_nix_config: |
      build-dir = /nix/var
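For illustration, the bind-mount and cleanup workarounds could be written as inline steps instead of a separate script. This is a sketch under assumptions and not a copy of prepare.sh: the exact files to delete, the step order, and the `hello` attribute are placeholders, and it has not been checked that the Nix installer accepts a pre-mounted /nix on every runner image.

```yaml
name: Build with extra disk space (sketch)
on:
  pull_request:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Put the Nix store on the larger /mnt volume
        run: |
          sudo mkdir -p /mnt/nix /nix
          sudo mount --bind /mnt/nix /nix
      - name: Remove preinstalled files the build does not need
        run: |
          sudo rm -rf /usr/share/swift
          docker image prune --all --force
      - uses: cachix/install-nix-action@v30
        with:
          extra_nix_config: |
            build-dir = /nix/var
      - uses: actions/checkout@v4
      - name: Build one package
        # "hello" is a stand-in attribute for this sketch
        run: nix-build -A hello
```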
One important aspect that ofborg currently provides, and that this issue doesn't mention, is the performance report. This currently works by evaluating nixpkgs twice, once before the PR and once after.
...Could that be another on-demand GitHub actions job? We could even run automatically if certain paths has been changed.
Sounds good to me.
Building linux kernel is fine on Github Actions, the CPU time is sufficient, it takes less than 2 hours to build Jovian-NixOS linux kernel, and Github Actions offer max 6 hours per run.
I am concerned about building the kernel modules (both in tree and out of tree).
Well. We should be able to quickly filter out and blacklist packages we don't want to build once the source of truth lives in the repository? Also, we can actually stop GitHub Actions runs, which was not possible with ofborg builds.
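Both ideas can be sketched in a workflow: a skip list that lives in the repository, and a `concurrency` group so that a newer push cancels a run that is still in progress. The file `ci/skip-build.txt` and the `ATTR` value are hypothetical names invented for this example, and the timeout is a placeholder.

```yaml
name: Build, skipping oversized packages (sketch)
on:
  pull_request:
# A newer push to the same PR cancels the run that is still in progress
concurrency:
  group: build-${{ github.event.pull_request.number }}
  cancel-in-progress: true
jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 120   # placeholder
    steps:
      - uses: actions/checkout@v4
      - uses: cachix/install-nix-action@v30
      - name: Build unless the attribute is on the skip list
        env:
          ATTR: hello   # placeholder for an attribute derived from the PR
        run: |
          # ci/skip-build.txt is a hypothetical file with one attribute name per line
          if [ -f ci/skip-build.txt ] && grep -Fxq "$ATTR" ci/skip-build.txt; then
            echo "$ATTR is on the skip list, not building"
            exit 0
          fi
          nix-build -A "$ATTR"
```

Because the list is an ordinary file in the repository, adding or removing an entry is just another PR.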
Maybe of interest for this issue, at least just for inspiration, but I've also (ab)used GitHub actions to build tests in my project using a dynamically generated matrix. My project uses flakes but this should be adaptable to non-flakes https://github.com/ibizaman/selfhostblocks/blob/main/.github/workflows/build.yaml This matrix then produces a big list of jobs, one job per test: https://github.com/ibizaman/selfhostblocks/actions/runs/11502502422
@ibizaman did you see this? https://github.com/thecaralice/flake-gha
IDK how useful this may be in this endeavour, but I was playing with adding check statuses to Nixpkgs commits. No success: it fails with error 404. If I point it at my fork it works, but the status doesn't appear in the PR. If a workflow in a user's fork needs to set a status, it would require some kind of proxy, which is pretty doable using something like Cloudflare Workers, basically for free.
I think this behavior is expected.
Evaluation on GitHub works like a charm! Evaluation takes 5 minutes, with ~3 minutes spent on the actual eval.
Minor problem: https://github.com/NixOS/nixpkgs/actions/runs/11940360927/job/33282898414#step:5:334
Should be fixable by just decreasing chunkSize here though:
nixpkgs/.github/workflows/eval.yml, lines 92 to 93 in 43c3e8b