Split the infra team in two
zimbatm opened this issue · 23 comments
Is your feature request related to a problem? Please describe.
Right now, the infra team is understaffed. We regularly need to deploy more services, and that need isn't being met because of the lack of capacity.
The main reason for keeping the infra team small is that everybody on the team has access to the binary cache signing key. Losing that key would allow an attacker to distribute malicious software with our blessing and open new vectors of attack.
Edit: actually, the more significant problem I see is that people tend to host things left and right, and then the NixOS project becomes dependent on those single-deployer services. E.g., I use https://nixpk.gs/ quite a bit. To combat this, the best approach is to have a common infrastructure that is relatively easy to deploy onto.
Describe the solution you'd like
Split the infra team in two:
- The build team that takes care of the build infrastructure.
- The infra team that serves the community on all the other infrastructure fronts.
The build team
The build team should have segregated access to the various components:
- Fastly
- AWS S3
- Hydra
- Equinix Metal runners
- ???
Most importantly, nobody else should be able to access the cache.nixos.org signing key.
Only super-trusted members should be able to gain access to that team.
The infra team
The infra team should have access to the rest to unblock and enable the community.
- Domain management so they can bind new hostnames
- Hosting provider so we can start deploying our things
- Netlify
- Cloudflare
- Email forwarding
- the monitoring infrastructure
- Bitwarden
- Ofborg?
This lowers the barrier to entry for the normal infra team, and also provides a ramp for infra team members to get promoted to build infra as we get to know them.
In order to achieve that, we would split the nixos-org-configuration repo in two as well.
Describe alternatives you've considered
Build a signing service that holds the signing key. Some sort of KMS service to further cordon off the key. But that's more work.
Additional context
A recent example: #52 (comment)
From the looks of it, the build infra team is everybody at Determinate Systems + vcunat and me.
In the "normal" (TODO: find a better term) team, we could have things like:
- the domain configuration, to bind new hostnames
- host deployment to start self-hosting more things (eg: the mailing list)
- Email forwarding
- the monitoring infrastructure
Hey,
As initiated in #52, I'd be interested in helping build this second team :)
I'm also thinking of Hexa for the infra team
Fine, I'm in. I think we should get this thing going sooner rather than later.
@zimbatm How do you see execution for this? Can we do something to help?
I need somebody to lead this effort and start pulling nixos-org-configurations apart. Ideally, we get all of the infra in one repo, and Hydra and the build farm in the other repo. We'll probably need separate hardware providers, or at least different Packet accounts as well.
> I need somebody to lead this effort and start pulling nixos-org-configurations apart. Ideally, we get all of the infra in one repo, and Hydra and the build farm in the other repo. We'll probably need separate hardware providers, or at least different Packet accounts as well.
I'm available to do that if nobody else wants to be the lead :)
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/notes-on-developing-a-marketing-team-manifesto/28622/1
I got an email this week from Scaleway that they are pausing their OSS program for 2023
> The main reason for keeping the infra team small is that everybody on the team has access to the binary cache signing key. Losing that key would allow an attacker to distribute malicious software with our blessing and open new vectors of attack.
Would it make sense to use a sharded key or multisig scheme? Like, at least two people on the team need to sign a package for it to be considered authentic.
The builds are signed by the infrastructure automatically, not by people. Like, lots of them every second minute.
I have been using NixOS for years now, would love to contribute back but don't know how to get started. I have professional experience with operating infra at scale, and would love to give a hand to the new team! How could I get the wheels running?
Currently blocked on getting the approval of the Foundation board and clarifying how we delineate the responsibilities and access of both teams. EDIT: Looks like I misunderstood the situation; there is already a consensus.
I am going to propose that the new team will have the necessary credentials, or at least CI access to apply changes in the terraform sub-folder over here: https://github.com/NixOS/nixos-org-configurations/tree/master/terraform
I'm confused about what the Foundation's role is here; we've decided to split the infra team by keeping critical things with the build team.
Ok, we now have a @NixOS/infra-build team specifically for the build infrastructure. Invited @mweinelt and @JulienMalka to the @NixOS/infra team.
I want to keep things open to @xvello as well but we don't know you really well, so feel free to hang out in the infra Matrix channel. Another good way to start is to send PRs to the https://github.com/nixos/nixos-org-configurations repo.
I am interested in joining the effort too @zimbatm.
Also note that ofborg is part of the Foundation's EM account, so I'd be hesitant to add it to the purview of the "standard" infra team and not the build infra team.
It's great to see all this enthusiasm. ckie and K900 also proposed to help on Matrix. And we have yet to advertise on Discourse.
I added @cole-h as he is already a de-facto member. Waiting to stabilize things a bit before giving access to more people.
The next step would be to make yourself comfortable with https://github.com/NixOS/nixos-org-configurations. The repo has multiple layers of historical dust that could be cleaned up. Make yourself at home. We might want to re-structure it in build / infra folders to better delineate the accesses. Update the README, ... These are conversations we can have over there.
Happy to see there are enough people stepping up! As life is a bit hectic right now I'll leave the new team to settle, and will reach out on Matrix later this year. Thanks for keeping the project running.
Closing as the infra team has already been bootstrapped.