Design tooling to ship bootloader updates
cgwalters opened this issue · 18 comments
Moving this from ostreedev/ostree#1873
We've already hit problems in Silverblue with people having absolutely ancient GRUB2 UEFI binaries, see fwupd/fwupd#2084 (comment)
For FCOS today the bootloader is written at disk image build time: https://github.com/coreos/coreos-assembler/blob/40c6d44497056b6af308ad7c7c9298a0ead3e975/src/create_disk.sh#L318
And then there is no mechanism we ship to update it (whether automatically or manually for that matter).
One thing that is nice is that since we shipped the aleph version marker we can at least reliably identify the versions of those binaries. (It would also be useful to have tooling that checksums them and attempts to identify e.g. the RPM package version they came from.)
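As a rough illustration of the checksumming idea, here is a minimal Python sketch. It assumes a lookup table from SHA-256 digest to a package label already exists; that table, the function names, and the labels are hypothetical and not part of any existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_binaries(root: Path, known: dict) -> dict:
    """Map each file under `root` to a package label, or "unknown".

    `known` maps hex digest -> label (for example an RPM version string).
    It is a hypothetical table that something else would have to generate.
    """
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            result[rel] = known.get(sha256_of(path), "unknown")
    return result
```

Anything reported as "unknown" would be a binary that did not come from a recognized package build, which is itself useful information.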
I think my strawman is something like this:
- Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space, let's call it bootupd because we have lots of imagination
- Write it in Rust, use e.g. https://github.com/coreos/afterburn/ as an example stub project
- bootupd can be passed a filesystem tree containing expected binaries (for UEFI) and replaces the bits in create_disk.sh to write them to disk to start, and includes metadata about them, and also knows how to wrap grub2-install as needed for the MBR
Now, an important thing that needs to be discussed here is at what cadence we apply bootloader updates. It might be OK to simply do them whenever the distribution makes them, but we might also want to make that optional, because it is likely (without significant engineering effort) to be a "don't turn off your computer right now" event.
It might be very useful for bootupd to define its own little "upgrade graph" model, at least the ability to only apply updates if a bootloader is too old rather than doing it on every update.
And client side this should be configurable (and perhaps off by default).
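To make the "only update if too old" idea concrete, here is a minimal Python sketch of such a policy check. Everything in it is an assumption for illustration: the `UpdatePolicy` type, the `min_acceptable` threshold, and the simplification that versions are purely dotted numbers (real bootloader package versions are messier).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdatePolicy:
    # Off by default, matching the suggestion that this be opt-in client side.
    enabled: bool = False
    # Hypothetical threshold: installed versions below this count as "too old".
    min_acceptable: tuple = (0,)


def parse_version(text: str) -> tuple:
    """Parse a dotted numeric version such as "2.04" into (2, 4)."""
    return tuple(int(part) for part in text.split("."))


def should_update(installed: str, available: str, policy: UpdatePolicy) -> bool:
    """Apply an update only when enabled, newer, and the installed copy is too old."""
    if not policy.enabled:
        return False
    inst = parse_version(installed)
    avail = parse_version(available)
    if avail <= inst:
        return False  # nothing newer to apply
    return inst < policy.min_acceptable
```

With `UpdatePolicy(enabled=True, min_acceptable=(2, 4))`, an installed 2.02 would be updated to 2.06, while an installed 2.04 would be left alone even though 2.06 is available.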
Probably to start the simplest is for bootupd to take as input a filesystem tree, a lot like /usr/lib/ostree-boot rather than (like fwupd and dnf/rpm-ostree) defining its own mechanism for retrieving content from http etc.
For FCOS/RHCOS we can then e.g. choose to pin the bootloaders separately from the main ostree content, or ship them as they come in.
I think my strawman is something like this:
- Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space, let's call it bootupd because we have lots of imagination
- Write it in Rust, use e.g. https://github.com/coreos/afterburn/ as an example stub project
- bootupd can be passed a filesystem tree containing expected binaries (for UEFI) and replaces the bits in create_disk.sh to write them to disk to start, and includes metadata about them, and also knows how to wrap grub2-install as needed for the MBR
This sounds good to me!!
Probably to start the simplest is for bootupd to take as input a filesystem tree, a lot like /usr/lib/ostree-boot rather than (like fwupd and dnf/rpm-ostree) defining its own mechanism for retrieving content from http etc.
Agreed. If there ends up being a benefit from defining other delivery mechanisms it could be added later.
We're going to have to do some work here if we want to ship these kinds of updates. My only question is specific to reuse versus owning end-to-end. Would integrating fwupd at any level end up adding more work to accomplish this? Put another way, if we provided a firmware update via the proper RHCOS/FCOS delivery mechanism and then used the expected metadata format pointing to the local location (didn't check if file:// is valid, but there are workarounds), does that help? Or should we directly own the process and code start-to-finish, as that would be a better solution for CoreOS style systems?
It definitely makes sense to touch base with fwupd. Maybe they'd agree to own this space, and it would make some sense, gets interesting then because fwupd would be updating itself.
I believe ChromeOS ships fwupd, but they seem to use different tooling for updating the bootloader:
https://www.chromium.org/chromium-os/firmware-porting-guide/2-concepts
Another huge sub-thread in this is that we may need to try to convert traditional RPM (and other) systems over to using this tool too, otherwise it will pile onto the delta carried here.
Got it. Sounds good. I think starting with the minimal set of functionality, IE:
- The update content is delivered to the system in $WAY and is dropped in $LOCATION
- bootupd applies the content from $LOCATION based on a simple graph
- Reboot occurs
is small enough not to write us into a corner as we continue to explore this idea, while still being helpful to those who would like this functionality and could help us test/verify
It's super tempting to do something just to update UEFI, because that boils down to copying files. But, the problem with that is it won't help us when at some point we want to use a feature in our grub.cfg that isn't supported by e.g. the GRUB in the MBR, and GRUB-in-MBR is used for e.g. OpenShift 4.1 bootimages in big cases like most public clouds. That of course circles into openshift/enhancements#201 - and that's one thread here, we can say anyone who wants bootloader updates needs to use new bootimages (reprovisioning in place existing nodes).
It definitely makes sense to touch base with fwupd. Maybe they'd agree to own this space, and it would make some sense, gets interesting then because fwupd would be updating itself.
I'm not sure if it makes sense from the perspective of the LVFS/fwupd maintainers, but having one tool handle it could make a lot of sense. It's already made a lot of progress at being cross distro and universally accepted. Maybe the distros could plug in to a universal model for where to ship the files via package management and then fwupd applies them or something.
It's a good opportunity to break with two unfortunate behaviors: nested mount at /boot/efi, and the persistent mounting of it. The latter is difficult security wise to justify leaving it hanging around all the time. Whatever is responsible for updating things related to the bootloader should be able to mount it, modify, and unmount. As to where to mount it, maybe do it somewhere in /run. I think this should be behavior out of the gate in 1.0 version.
For 2.0 I think it should be multiple device aware, and capable of properly syncing the drives, e.g. the raid1 use case, where both drives need bootloaders and updates.
I'm uncertain about resolving the differences between the sd-boot and Fedora blscfg.mod BLS paradigms; neither of them are accepted upstream GRUB which is also unfortunate. This messiness has made for extra work on the RH/Fedora boot folks.
One thing that is going to get a bit better in F33 GRUB is no longer depending on grubenv to store variables like kernel arguments. They'll go in the BLS snippets directly which is more like the sd-boot/upstream BLS.
- Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space, let's call it bootupd because we have lots of imagination
- Write it in Rust, use e.g. https://github.com/coreos/afterburn/ as an example stub project
We discussed the same idea with @vathpela and @gicmo in the past. And even said that rust was likely the saner option, so our goals are very much aligned.
Another huge sub-thread in this is that we may need to try to convert traditional RPM (and other) systems over to using this tool too, otherwise it will pile onto the delta carried here.
I think that makes sense for traditional Fedora as well, since it is needed for legacy BIOS x86_64 as you said but also for ppc64le, which has a similar setup. We are currently not updating the bootloader for those platforms in non-ostree based variants either.
Even for EFI it would be good to decouple the package installation from the ESP update, for example, as @cmurf said, to only mount the ESP when it needs to be updated. It would also allow for other future improvements, for example an A/B update mechanism for the bootloader, since files would no longer be copied to the ESP directly as part of a package update, overwriting the existing EFI binaries.
Maybe it can also be used for s390x since the bootmap has to be updated on each kernel update by calling the zipl tool. It's true that this is not updating the bootloader per se but still is an action that needs to be taken in order to update the bootloader configuration.
Currently that is done by the ostree zipl backend but that doesn't feel completely right. This tool might take care of that and allow a sysroot for s390x to just be configured with bootloader=none instead of requiring bootloader=zipl.
I'm uncertain about resolving the differences between the sd-boot and Fedora blscfg.mod BLS
I don't think that this tool should care about BLS at all since ostree handles that. But only about the things that need to be updated that are not part of the ostree deployment transaction.
paradigms; neither of them are accepted upstream GRUB which is also unfortunate. This messiness has made for extra work on the RH/Fedora boot folks.
Yes, it's unfortunate that upstream GRUB doesn't like the BLS. Last time we discussed this the maintainers said that maybe we could define a BLSv2 that better aligns with the GRUB configuration and features. I still don't think upstreaming the blscfg module is a lost battle, but this isn't really relevant to this discussion in my opinion.
One thing that is going to get a bit better in F33 GRUB is no longer depending on grubenv to store variables like kernel arguments. They'll go in the BLS snippets directly which is more like the sd-boot/upstream BLS anyway.
That's correct, using the grubenv to store the cmdline caused more harm than good. But keep in mind that this was only used for traditional Fedora since ostree manages its own BLS snippets, so only the blscfg module from the GRUB package is needed there.
Maybe it can also be used for s390x since the bootmap has to be updated on each kernel update by calling the zipl tool. ... Currently that is done by the ostree zipl backend but that doesn't feel completely right. This tool might take care of that and allow a sysroot for s390x to just be configured with bootloader=none instead of requiring bootloader=zipl.
If it involves changing the kernel, that is ostree's (or traditional rpm/yum's) domain. So I would strongly prefer the status quo of bootloader=zipl over teaching ostree how to execute bootupd since that's tying them together when they should be independent.
So I would strongly prefer the status quo of bootloader=zipl over teaching ostree how to execute bootupd since that's tying them together when they should be independent.
Right, I guess bootloader=none is only suitable for the case when just writing BLS snippets is enough and no other action is needed for the bootloader to parse the new configuration.
For FCOS today the bootloader is written at disk image build time:
Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space
It sounds like you're defining "problem space" here to mean that the image builder writes the bootloader and then there's no way of updating that part later.
There is an alternative, wider view that the problem space is that the image builder makes a series of decisions and actions that affect what is written within the installation (beyond the deployment of the ostree), and those aspects may need to be updated later. The bootloader is one such element within that wider problem space.
This is an approach I have been working on exploring here: Updating the OS data that ostree doesn't manage
Maybe this is not such a pressing issue on your side at the moment, given (as discussed there) there is the reprovision-on-every-boot approach that works well for cloud setups, and the Silverblue image build is currently not much more than deploying an ostree and installing a bootloader anyway. But I feel inclined to mention this anyway given that it's largely a Silverblue (non-cloud) case that has put this topic on the table again, plus our past experience at Endless which may have relevance for Silverblue going forward. We originally started considering the problem as a bootloader update thing, but then we enjoyed an uptick in userbase and accumulated a bunch of product history on the way, and now we firmly find ourselves with a larger problem to solve.
Those discussion points aside, there is probably no fundamental incompatibility between a separate bootupd being envisioned here, and the type of wider solution I'm considering for Endless.
However, one lesson learned along this journey is that you should work hard to avoid there being a separate codepath for installation vs update, that is a recipe for failure. So when building images, the bootloader installation should be done by a call into this new solution. Don't end up with image builder code that installs the bootloader being separately maintained from the bootupd solution that puts the updates in place later.
From that angle, if bootupd means "boot updater" then that's an imperfection in the name, since it would also be used for installing a bootloader on a blank disk, it's not only the update case. (Or does it mean "bootup daemon"?) Also, if it does handle both installation-on-blank and updating, then that's a bit of a conceptual difference from fwupd.
In terms of this being a distro-agnostic generic thing, in addition to supporting grub EFI & MBR it would be great if the underlying design would allow this to be cleanly extended to also supporting systemd-boot, Raspberry Pi (i.e. blobs on a special non-ESP FAT partition), and other ARM solutions with good mainline kernel/bootloader support like Rockchip and Amlogic that need special blobs written onto specific sectors at the start of the disk.
Thanks for replying dsd! I agree with all of your points generally, and I want to reply to one:
From that angle, if bootupd means "boot updater" then that's an imperfection in the name, since it would also be used for installing a bootloader on a blank disk, it's not only the update case. (Or does it mean "bootup daemon"?) Also, if it does handle both installation-on-blank and updating, then that's a bit of a conceptual difference from fwupd.
See this comment in the original issue:
bootupd can be passed a filesystem tree containing expected binaries (for UEFI) and replaces the bits in create_disk.sh to write them to disk to start, and includes metadata about them, and also knows how to wrap grub2-install as needed for the MBR
So yes we're totally in sync that bootupd would need to be responsible for wrapping the installation as well.
Some initial work on this in https://github.com/coreos/bootupd/
We discussed the Boot Hole vulnerability today in the community meeting, and it came up that OSTree doesn't update the bootloader because it can't do so atomically on FAT. I had a chat with @vathpela afterwards who said it might actually be possible. Pasting logs:
<jlebon> re. FAT, did you mean that it is actually possible to have e.g. an atomic rename(2) ?
<pjones> atomic rename(2) is hard
<pjones> actually it might not be that hard.
<pjones> I'll have to look.
<pjones> atomic /copy/ definitely should be doable.
<jlebon> yeah, though i guess it doesn't help if it's multiple files that need to be updated
<pjones> so the thing you do get 100% atomically write(fd, buf, 512) after an lseek(fd, offset_aligned_to_512, SEEK_SET)
<pjones> which means if nothing else we can make /boot/efi/EFI/fedora/a/ and /boot/efi/EFI/fedora/b/ and literally a text file /boot/efi/EFI/fedora/lng
<pjones> and lng (for "last known good") say "a" or "b" in it.
<pjones> or if not lng, "current"
<pjones> (I forget which model you need for this)
<jlebon> hmm, how would this work with the EFI firmware though?
<pjones> or even .. you have something like serialized version numbers right?
<pjones> so the trick there is that when we install, we create both of those directories
<jlebon> like, how would it know to look at `current`, then find the files at `/boot/efi/EFI/fedora/$current`
<pjones> and we create boot entries for both of them, and put them both in the boot order
<pjones> updates to BootOrder *should* be atomic so long as the number of entries in the order doesn't change, but there might be some work in linux and userland we need to do in order to make that true
<pjones> and then we make something in the early startup check BootCurrent to see which one we booted
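The a/b pointer file idea from the log could be sketched roughly like this in Python. This is only an illustration under assumptions: the file layout (a slot letter padded out to one 512-byte sector) and the function names are made up here, and whether the single aligned write is actually atomic depends on the kernel, filesystem, and storage device, not on this code.

```python
import os

SECTOR = 512          # size of the single aligned write discussed in the log
SLOTS = ("a", "b")    # the two directories, e.g. EFI/fedora/a and EFI/fedora/b


def other_slot(slot: str) -> str:
    """Return the slot that is not currently selected."""
    return "b" if slot == "a" else "a"


def write_pointer(path: str, slot: str) -> None:
    """Record the selected slot with one 512-byte write at offset 0."""
    if slot not in SLOTS:
        raise ValueError(f"unknown slot: {slot!r}")
    # Pad to a full sector so the file size never changes between updates.
    buf = slot.encode("ascii").ljust(SECTOR, b"\n")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        if os.pwrite(fd, buf, 0) != SECTOR:
            raise OSError("short write to pointer file")
        os.fsync(fd)
    finally:
        os.close(fd)


def read_pointer(path: str) -> str:
    """Read back the selected slot, rejecting anything unexpected."""
    with open(path, "rb") as f:
        slot = f.read(SECTOR).strip().decode("ascii")
    if slot not in SLOTS:
        raise ValueError(f"corrupt pointer file: {slot!r}")
    return slot
```

An updater following this model would write new binaries into `other_slot(read_pointer(path))`, sync, and only then flip the pointer.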
Is rename(2) atomic on FAT? - on linux-fsdevel@, Oct 2019
I think the fact this is atomic at the VFS level but not at the FAT level, is an acceptable risk. But also, on common consumer SSD and NVMe, concurrent writes end up on the same erase block. Rename is in effect atomic. Just no one will guarantee that, because in some strange case it might end up the writes happen across two EBs.
My suggestion is copy the new bootloader with some temp name. Sync. Rename. And unmount the volume. There's no good reason to keep this thing mounted persistently all the time anyway.
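A minimal Python sketch of the copy, sync, rename sequence, for Linux. The `.tmp` suffix and the function name are assumptions for illustration, and the unmount step is left out because it needs privileges and depends on how the ESP was mounted.

```python
import os
import shutil
from pathlib import Path


def replace_file_synced(src: Path, dest: Path) -> None:
    """Copy `src` next to `dest` under a temp name, sync, then rename over `dest`."""
    tmp = dest.with_name(dest.name + ".tmp")  # hypothetical temp naming scheme
    with src.open("rb") as fin, tmp.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())  # make sure the new content is on disk first
    os.replace(tmp, dest)        # rename(2) underneath
    # Sync the containing directory so the rename itself is persisted.
    dir_fd = os.open(dest.parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If power is lost before the rename, the old binary is still in place and only a stray temp file remains. What happens if power is lost during the rename on FAT is exactly the open question in the linked thread.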
Anaconda clean installs to completely empty media have resulted in a 600MiB EFI system partition since, I think, Fedora 32. I have a laptop with a ~1 year old clean installed Windows 10 Pro using Microsoft produced media and their installer - not the OEM. And it's a 100MiB ESP. Ample free space is needed on the ESP for firmware update payloads, regardless of what platform does the update.
bootupd is now in FCOS. Any next steps for this ticket?
It's still in preview but yeah, I think we can close this as "MVP done".
For anyone curious with current stable the steps are:
$ systemctl enable bootupd.socket
$ env BOOTUPD_ACCEPT_PREVIEW=1 bootupctl update
The preview requirement was dropped in 0.2.0.
(But for FCOS...Fedora still hasn't shipped a shim update for boot hole so there's not a really strong reason to update your bootloader today)