thoughtpolice/buck2-nix

Use of upstream buck2-prelude

ndmitchell opened this issue · 1 comments

Very cool project :)

I wondered if you'd considered whether it would be possible to reuse the open source upstream Buck2 prelude as well? It seems that Nix is a great source for toolchain definitions, but reimplementing things like Rust/Haskell/OCaml/C++ etc on top separately from the upstream prelude would be a lot of work.

Happy to chat further over video conference or similar if that's easier, and happy to answer any questions you have about Buck2. It's still early days, but integration with Nix is super cool.

I don't think it would be impossible, but it might be a bit difficult. There's a lot of surface area to cover. But before anything else, I think the biggest motivating reason I started from scratch here is:

  • To learn what Buck2 is capable of doing, and what it isn't capable of doing. Writing a prelude from scratch basically forces me to understand the conceptual model. Also, I would probably feel hopeless trying to debug the prelude while buck2 is still moving so rapidly and trying to modify it myself. Having my own prelude seriously reduces "surface area" here while I'm wading in.

Nix is actually kind of an accessory, in this sense. I know Nix very well, so the "overhead" of me adapting it to my needs is very minimal. And if I want to drive something like Buck from scratch, I need a set of tools. And there's no better place to download my set of tools from. It's like: it just so happens that peanut butter and jelly go well together, but nobody set out to make that happen a priori.


So with that said, I think adapting the upstream toolchain definitions to Nixpkgs is possible. But first we need to understand the prior work and design space here a little.

This toolchain is an experiment in the sense that it inverts some of my prior design choices in this space. For example, I had many private projects that used Shake. They worked like this: Nix would populate the environment, and Shake would execute in that environment. So you'd go into a directory, type nix shell, and then you could run commands like shakebuild all to build the all target. That was the interactive loop. And when the CI system built things, or you wanted to produce packages, you ran nix build, which built the whole thing from scratch, under the nix daemon, and all that jazz. Nix provides the hermetic environment. The tagline here is "Nix drives the build system", and Nix is what the user interacts with from this perspective.

This different design is informed by one I saw at a previous employer and some other prior art. In short, we reverse the tagline: "The build system drives Nix". The idea is that we use Buck to invoke the Nix tools, download things, and run toolchains provided by Nix. These toolchains can describe which environment variables they need set, how to do stuff, etc. But ultimately Buck now wraps Nix and drives it to build the system to completion.
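As a rough illustration of "the build system drives Nix", here is a minimal Python sketch of the two steps a rule performs conceptually: ask Nix to realise a toolchain, then run the tool from the resulting store path rather than from $PATH. The function names and the nixpkgs#gcc installable are assumptions made up for this example, and it assumes the newer nix command-line interface is enabled. The actual rules in this repository are written in Starlark and may work quite differently.

```python
from typing import List


def nix_build_argv(installable: str) -> List[str]:
    """Command line asking Nix to build `installable` and print its store path."""
    # --no-link avoids creating a ./result symlink; --print-out-paths prints
    # the resulting /nix/store path(s) on stdout.
    return ["nix", "build", "--no-link", "--print-out-paths", installable]


def tool_argv(store_path: str, tool: str, args: List[str]) -> List[str]:
    """Command line running `tool` from a Nix store path, never from $PATH."""
    return [f"{store_path}/bin/{tool}"] + list(args)


# Hypothetical usage: the store path would come from the output of the first
# command; a placeholder stands in for it here.
print(nix_build_argv("nixpkgs#gcc"))
print(tool_argv("/nix/store/PLACEHOLDER-gcc", "gcc", ["-c", "main.c"]))
```

The point is only the direction of control: the build system decides when Nix runs, and every tool invocation uses an absolute store path.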

Nix actually does two things in this repository. First, it compiles buck2 and some other tools, and puts them in $PATH; that happens when you shell into it and direnv kicks in. This "shim layer" provides the development environment. Second, I also use Nix to describe the toolchains, which are driven purely by Buck on-demand. This is the "toolchain" layer. So, conceptually these are different pieces of code for different use cases. Nix is just so useful for provisioning tools in a development environment that it makes sense to use it to do things like provide buck2 executables (the "shim"), rather than rely on the user to install a compatible copy themselves. It's important to distinguish these.

This inversion has a number of consequences:

  • Users are no longer exposed to Nix nearly at all. Instead they are exposed to Buck, and Buck has a similar lexicon and "book by its cover"-look as other systems like Bazel. They share Starlark, have a similar concept of targets and mapping the filesystem hierarchy as part of the dependency name, etc. This has upsides and downsides.
    • I have used Nix for many many years. I love it to death. But it's a difficult tool to use correctly and the language and its abstractions and the nixpkgs codebase are a lot to understand. Part of this experiment is asking how much more friendly Starlark is to users, and I suspect it ranges from "a little" to "a lot". That isn't the most important question to ask, but it's one of them.
    • No matter how much I hate it, I really do think Starlark, thanks just to the "It's kinda python!" trenchcoat, really puts buck in a better light than Nix, which people have to learn.
    • This criticism really applies to Shake too, and I suspect it's one you hear frequently. The greatest strength (Haskell) is the greatest weakness (Haskell), even if everything else was overcome.
  • The incremental nature of buck is the default, rather than the "wholesale" build of Nix.
    • The user is more exposed to buck, so they just use buck all the time, and buck does a great job at this when rebuilding in the fast and slow paths.
    • Most nix users rely on nix build when nix is at the forefront. But this "wholesale" build really requires granularity in the derivation structure to achieve good performance, and without content-addressable derivations, features like early cut-off don't exist.
      • For example, a single change to the buck2 nix expression in this repository causes a 20min CI build time spike, because it recompiles all Cargo packages every time. You need to have every Cargo package become an individual Nix derivation, but that requires a lot of magic and there still isn't a good solution to it today. nixpkgs probably wastes ungodly amounts of compute on this.
      • Unfortunately, CA derivations are still buggy; I tried enabling them several times here. Until they're really working (and ideally the default), and I really want them to work, early cut-off will always be a major advantage to Buck.
    • Buck in the long run will have a number of bigger benefits like remote cache hits to improve the performance, where a fully signed nix cache for every person or developer or build isn't really appropriate.
    • But there's a cost to all this, which is...
  • Buck lacks the hermeticity that Nix provides. I don't exactly know what Buck does, but Nix goes to extreme efforts to isolate builds, built up over years of experience, and I honestly doubt Buck can provide the same level of "fidelity": separate users/groups to build things, filesystem namespaces, cgroup isolation, etc. Almost nothing else does it as thoroughly IME.
    • For example, I don't even think buck unsets PATH or anything before trying to invoke something. Why would it? That's up to the rules. So it basically fails this test immediately, because any rule can use the ambient filesystem. It's simply too easy to get implicit assumptions on this wrong.
    • It's extremely easy to get this wrong in subtle ways; I can't count the number of times I've seen gcc accidentally link two copies of a shared library, one from /usr/lib and one from /nix/store, into the same binary, only for it to fantastically explode later.
    • But this hermeticity comes with a cost. Incremental builds in Nix require features like recursive nix, and generally evaluation and realizing store derivations is expensive.
    • But you can overcome this: just write a default.nix to run buck build ... inside a Nix sandbox, and then run this in your CI system like, once a day or whatever. This isn't what most people should use, since it completely kills incrementalism at the build level. But full rebuilds like this are often required to truly suss out things like reproducibility issues or hidden dependencies on the ambient filesystem (oh, I was accidentally relying on the "which" command to be installed, which came from /usr/bin, stuff like that).
    • Doing that is probably the better choice in the long run because you're just making a machine run some CI, rather than making users rebuild constantly. You can also invalidate rules in various ways and commit those changes to force rebuilds if something happens (e.g. something like a more granular version of shakeVersion can probably be built completely in Starlark, I think?)
  • Many components of Nix must now be driven by Starlark
    • This just requires a lot of complex and dirty code to make the two interact together, as you note.
    • This repository fixes the set of toolchains and nixpkgs version, to reduce drift, and isolate the user from their own choices in toolchains.
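To make the "early cut-off" point from the list above concrete, here is a toy Python sketch. It is not how Buck or Nix implement anything; the Step class and the two steps are invented for illustration. Each step is cached by the hash of its input content, so when an upstream step produces byte-identical output, the downstream step is skipped. With input-addressed Nix derivations the downstream hash depends on the upstream inputs rather than the upstream output, so a change like this would rebuild everything.

```python
import hashlib
from typing import Callable, Dict


class Step:
    """One build step whose result is cached by the hash of its input."""

    def __init__(self, fn: Callable[[bytes], bytes]) -> None:
        self.fn = fn
        self.runs = 0  # how many times the step really executed
        self._cache: Dict[str, bytes] = {}

    def build(self, data: bytes) -> bytes:
        key = hashlib.sha256(data).hexdigest()
        if key not in self._cache:
            self.runs += 1
            self._cache[key] = self.fn(data)
        return self._cache[key]


def strip_comments(src: bytes) -> bytes:
    """Toy 'compile' step: drop lines that start with '#'."""
    lines = [line for line in src.split(b"\n") if not line.startswith(b"#")]
    return b"\n".join(lines)


compile_step = Step(strip_comments)
link_step = Step(lambda obj: b"linked:" + obj)


def build(src: bytes) -> bytes:
    return link_step.build(compile_step.build(src))


first = build(b"# v1\nmain")
second = build(b"# v2, comment-only change\nmain")
print(compile_step.runs, link_step.runs)
```

After the two builds the compile step has run twice, because its input changed, but the link step has run only once, because the compiled output was identical both times.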

There are probably others. Anyway, this is getting long! So I'll try to wrap up the next section quick.

Can the existing prelude work with Nix? ("Nix drives the tools")

Honestly? I think so, probably. If you simply populate a nix-shell with the tools you want, the prelude might work today! I haven't actually tried it. This is the "Nix drives the tools" approach. This can probably work because Nix environments for "a language" generally try to emulate the semantics of non-Nix environments as closely as possible. So they'll set up PYTHONPATH etc for you.

But it's tricky, because things like PYTHONPATH require special care in Nix. It's not enough to have a rule like python_binary(name = ..., deps = [ "//python:click" ]), because you also need to set up your environment with a Nix Python interpreter, combined with Nix Python libraries, which ensures their PYTHONPATH is configured properly. So you have to specify both of these things, once for Nix and once for Buck.
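One way to picture the double bookkeeping is a small consistency check between the two declarations. The following Python sketch is purely hypothetical: the target label, the dependency table, and the set of Nix-provided packages are made up, and neither Buck nor Nix exposes data in this shape. It only shows that the same dependency has to appear on both sides.

```python
from typing import Dict, List, Set

# Hypothetical: dependencies declared on the Buck side, keyed by target label.
buck_deps: Dict[str, List[str]] = {
    "//app:cli": ["//python:click", "//python:requests"],
}

# Hypothetical: Python packages the Nix environment puts on PYTHONPATH.
nix_python_packages: Set[str] = {"click"}


def missing_from_nix(
    target: str, deps: Dict[str, List[str]], nix_pkgs: Set[str]
) -> List[str]:
    """Names of Buck-declared deps that the Nix environment does not provide."""
    # Take the part after the last ':' of each label as the package name.
    names = [label.rsplit(":", 1)[-1] for label in deps[target]]
    return sorted(name for name in names if name not in nix_pkgs)


print(missing_from_nix("//app:cli", buck_deps, nix_python_packages))
```

Here requests is declared for Buck but absent from the Nix environment, which is exactly the kind of drift that specifying things twice invites.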

This actually isn't too bad of a deal, and it's basically what everyone else already does. For example many Python packages inside Nixpkgs have to specify their dependencies at the Nix level and at the Python level in a poetry file or whatever. It's just the way it is. The inverted design of this repository might seek to overcome this burden, but it will come with more complexity in the Starlark layer.

But can it work? I think so. The problem of course is in the details, and having to align the surface area of buck-prelude with the surface area of nixpkgs, which is where all the abstractions lie.

Summary

Too much to summarize and I wrote enough already. If you'd like to have a video chat sometime, maybe with a vscode session, that'd be great, and I can go over some of my thoughts! Please let me know and maybe we can squeeze it in sometime, though it's busy for me.