NixOS/foundation

Github Request List - Feedback/Issues/Features

refroni opened this issue Β· 45 comments

We will be meeting recurringly with Github to go over items from either side. Please add in anything that might be of interest to bring up/discuss on the topic of Github and NixOS.

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-github-collaboration/23432/11

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-github-collaboration/23432/1

The Nix Community is currently working with the OCI specification body to prototype a non-conflicting-on-prefix: "/prefix" layer type that can be a nix store path.

Constructing an overlayfs is unduely slow and pidgeon-holing everything into the 127 layer limit decreases the chance of deduplication of data.

A non-conflicting layer type can bypass the overlayfs assembly and efficiently query dependencies (i.e. image layers).

This can be very interesting for GitHub as a means to increase data-reuse and lower bandwith (and also speed up action assembly).

It requires to serve a Nix Cache that can serve the individual store paths and a patched registry (e.g. GHCR) which is capable of relaying these layer types accordingly (e.g. to the Nix Cache).

Since GitHub has to maintain a pretty massive build infrastructure, the proposed optimizations would cash in rather quickly.

The NixOS Community can be used to prototype this mechanics and together (with combined political capital) we can make sure before the OCI standards commitee (which are quite open to the idea, already, anyways), that these machanics get included into the official OCI specificiation and that new versions of runtimes would have to start complying to make this principle ubiquitous.

I think nixpkgs could benefit from and help refine issue/PR triage workflow features.

The GH issues/projects improvements have gotten close to what we might need to build a more intentional triage process (i.e., a bin of actionable issues + a fair, manual, non-stalebot mechanism for booting unactionable issues out of that bin until someone provides feedback), but the key features haven't been scheduled--and we could probably do quite a bit better if they were open to more specific requests.

I won't barf more context here, but I've discussed it a few times on the forums https://discourse.nixos.org/t/stale-bot-for-nix-just-closed-a-bunch-of-stuff/18661/3?u=abathur

For a large PR discussion it is non-trivial just to get all the comments on the single page:

  • resolved and unresolved
  • code comments and discussion comments
  • without cutting in the middle

This is, however, still the easiest way, say, to search a statement you vaguely remember. Is improving the situation around this an option?

We have 8.5k open issues on Nix and Nixpkgs and there is no way to make sense of them all. It's impossible to get a big picture and figure out what to work on first if they are displayed and managed as a list. It's the wrong data type to do things at Nix scale.

I have only one wish to GitHub: Please introduce issue dependencies in the data model and at least expose them in the API (ideally with a dedicated call, so we don't have to hammer you with requests to get the whole dependency graph).

There are clumsy tools for this with clumsy UI based on error-prone string wrangling in issue descriptions. Please make this a first class citizen.

Here's the relevant part of my NixCon 2022 talk illustrating the struggle we have right now and how much it would help the project forward.

The monthly pulse report isn't working reliably on the repository nixos/nixpkgs, I get the Unicorn page "the page is taking too long to load" 95% of the time when opening https://github.com/NixOS/nixpkgs/pulse/monthly

Adding KVM support for GitHub actions would be great feature. See: community/community#8305

One issue we have is that we have so many members in the org, it's hard to distinguish what position each user providing feedback has. This makes it hard for newcomers to weigh the value of feedback they get on PRs.

To mitigate that issue, it would be helpful if associated teams were displayed next to a user. At least on the PR comments.

On the @NixOS/cuda-maintainers side of things, two things come to mind:

  1. Merge trains! See https://discourse.nixos.org/t/nixpkgss-current-development-workflow-is-not-sustainable/18741 for context.
  2. GitHub Actions running longer than 6 hours. One of our CI projects (https://github.com/samuela/nixpkgs-upkeep/) frequently runs into timeouts when building large packages like jaxlibWithCuda, torchWithCuda, tensorflowWithCuda, etc. Alternatively having more vCPUs on the free tier runners could also solve the problem.

cc @SomeoneSerge for anything else I might be missing?

Since GitHub Actions were mentioned: CI on GitHub is comfy, and GitHub Actions with Nix are almost a bliss, except for one piece missing: a native way to populate /nix/store/ from some hot cache, instead than fetching the same things from cachix and hydra over and over again

EDIT: I see that @blaggacao suggests something maybe in these lines (RE: "reuse" and "bandwidth")

NobbZ commented

A full clone of the repository downloads several GiB of data. After running a git gc locally the .git folder gets reduced to below 1 GiB.

Is there a way to do this "server side" such that git clone time and data transfered gets reduced?

Since GitHub Actions were mentioned: CI on GitHub is comfy, and GitHub Actions with Nix are almost a bliss, except for one piece missing: a native way to populate /nix/store/ from some hot cache, instead than fetching the same things from cachix and hydra over and over again

EDIT: I see that @blaggacao suggests something maybe in these lines (RE: "reuse" and "bandwidth")

Additionally, the native GitHub cache action does not work properly with /nix/store/ due to its permissions. Fixing this should be an easy target and a big improvement especially for actions that build a lot of small things.

My biggest problem is that people rebasing PRs onto staging through the github UI currently notifies hundreds of maintainers. I get around 20 accidental code reviews assigned per week by people using the GitHub UI instead of following the contributing guide

RFCs are impossible to follow because GitHub hides 90% of the conversations. The code review feature simply doesn't work if there are many reviewers discussing a document

Speaking of the CODEOWNERS feature: not only does it ping too many people at times, it also has the major restriction that one has to have write access to the repository to be able to register as a code owner. This makes kind of sense for most normal repositories, but in our case it is needlessly restrictive. It also makes the feature kind of unusable for teams too.

Now our GH actions are:

runner *.yaml

Can we make it:

runner nix build .#my-awesome-gh-action-yaml-generator
runner run *.yaml

So that I can describe my CI in nix. So that I can generate package deps graph from nix to yaml CI jobs to maximize speed and parallelism.

Thanks everyone for adding more items here. I plan to send an initial list over next week and will update.

In my abundant stupidity (and clumsiness), today I learned that protecting branches by pattern does not forbid creating new branches matching this pattern with a direct push. Could there be an option that only specially authorised users can create such branches?

It would be great if Github served tar.xz or tar.zstd compressed source archives.

Nix relies on downloading tarballs/zipballs of the nixpkgs repo quite heavily. Github currently delivers only DEFLATE compressed source archives (.zip / tar.gz). This is an old an inefficient compression algorithm which slows down downloads on connections with limited bandwidth significantly.

It would be best if github would serve tar.xz compressed archives, but this might not be an option, as it is expensive to compute.

tar.zstd in contrast, with the right settings, is more efficient and more effective than tar.gz.

The fact that, either intentionally or accidentally, replies to existing review discussions written when in the process of a review get published with that review, is very annoying for large PRs.

Not only does it duplicate comments in the UI, but the comment associated with the review (as opposed to the one that shows in the existing discussion) contains no context, not even a reply button like new comments.

It would be great to see this functionality revised/fixed.

actions/toolkit#1253

(If it doensn't get organic attention)

NixOS/nixpkgs#207310
NixOS/nixpkgs#207311

This is a single action firing twice, right? Is it even supposed to be possible? It was bad enough when NixOS/nixpkgs#118661 went (almost? who knows…) all-false-positive because of race conditions, but this seems to be a larger race condition here!

(The request here being Β«could we please get a model for what race conditions are supposed to be possibleΒ», although restricting this set is obviously always welcome)

Bugs

GitHub merges random PRs when they are empty : NixOS/nixpkgs#191665 ; the PR was empty at some point, and it got considered as merged even if the "merge reference" is a completely different commit!

GitHub UI : It uses some sort of Turbolinks to give a SPA feeling, but when searching in nixpkgs, it is often met with long times of waiting, maybe offer a way to perform full page refresh everytime.

API access

I would like to run advanced data analysis of the historical data in nixpkgs, for this, I need (regularly) to fetch the nixpkgs PRs (and ideally issues), but, nixpkgs has 200k PRs, page limits are 100 for default user tokens.
Plus, I encounter a 502 many times when fetching PRs: https://github.com/RaitoBezarius/understanding-nixpkgs/blob/main/notebooks/01.%20What%20is%20the%20security%20story%20in%20NixOS%20%3F.ipynb

I know that some organisations can have tokens with increased page limits, I wonder if this is possible to put some NixOS developers in an allow-list for some of these tokens.

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-foundation-board-meeting-minutes-jan-3-2023/24476/1

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-github-collaboration/23432/15

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-github-meeting-notes-and-updates-january-2023/24762/1

Updated responses from the conversation with github -> https://discourse.nixos.org/t/nixos-github-meeting-notes-and-updates-january-2023/24762

Thank you everyone for collaborating on this! Let's continue the thread as we will be meeting with them quarterly.

It would be nice to have access to the "auto-add to project" workflow for github projects: we mark cuda-related issues with a label, but we would appreciate having just one place to watch

We're a big project, our actions are not running for frivolous reasons, and just a few minutes ago we've apparently run into API rate limits. They affect our labeling, automerging, backporting, manual and editorconfig actions, which are currently failing on all changed PRs.

Examples:

It would be nice if these limits could scale to and with our needs.

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-foundation-update-github-collaboration-april-12/26833/1

Reminder - upcoming meeting on April 12

Ok, this one is a bit ridiculous concerns UX.
There are two things about PR branches that I want to do outside the browser:

  • I want to use builtins.getFlake github:ShamrockLee/nixpkgs/write-multiple-references in nix repl,
  • and I want to git remote add ShamrockLee git@github.com:ShamrockLee/nixpkgs. Note how both schemas happen to follow the same pattern: domain:user/repo/branch. It's fate!

However, when I click on the "copy" button next to ShamrockLee wants to merge 1 commit into NixOS:master from ShamrockLee:write-multiple-references, what do I get? Right, it's ShamrockLee:write-multiple-references which doesn't fit either

EDIT 2023-04-12: @refroni yes, thanks, that's it! Just enabled it for 6.topic: cuda

But also, I did try to elaborate on the previous question about caching: https://discourse.nixos.org/t/nixos-github-meeting-notes-and-updates-january-2023/24762/5?u=sergek

The TLDR is that if they've ever tried to speed up loading docker images in github workflows (by caching them), then maybe they could experiment with extending that to /nix/store, and maybe even follow up by sharing their experiences. Frankly though, I should be absolutely shocked and amazed if that happened

@SomeoneSerge regarding auto-adding, the just added it recently :)
https://docs.github.com/en/issues/planning-and-tracking-with-projects/automating-your-project/adding-items-automatically
Is this what you were looking for?

@mweinelt They are looking into it. As a temporary solution they are willing to bump our rates temporarily whenever we have an expectation for things to increase such as prior to a hack/nixcon/etc.
We'd just need to give them a heads up on the dates
Go from 12500 to 62500

We have two major mass-ping issues that make notifications hard to manage:

  • Changing target branches on PRs or rebasing them incorrectly often results in mass-pinging code owners by accident, as commits which aren't in the target branch become part of the PR (as already mentioned by piegamesde above); some way of limiting this would be good.
  • Anybody can highlight the @NixOS/nixpkgs-maintainers team, which contains many many people. If team mentions and team review requests were gated behind a permission, so that nobody can mention big teams like this. Thank you zimbatm!

Point (2) has been solved by disabling the notification settings on the team level. This has been applied to the nixpkgs-maintainers and nixpkgs-committers groups. Point (1) is still an issue.

Has GitHub said anything about the target branch mass-notification issue?