coreos/rpm-ostree

High memory usage when updating

gdonval opened this issue · 5 comments

Host system details

Provide the output of rpm-ostree status.

State: idle
AutomaticUpdatesDriver: Zincati
  DriverState: active; periodically polling for updates (last checked Mon 2023-11-06 19:58:54 UTC)
Deployments:
● fedora:fedora/x86_64/coreos/stable
                  Version: 38.20231014.3.0 (2023-10-30T15:27:11Z)
               BaseCommit: 2d12493e6479cced115b676c684c0ba1d9b29d69164a5faf0134508aa5ef5330
             GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464
          LayeredPackages: tailscale

Expected vs actual behavior

# rpm-ostree update
2 metadata, 0 content objects fetched; 788 B transferred in 2 seconds; 0 bytes content written
Checking out tree 2d12493... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora tailscale-stable updates-archive
Updating metadata for 'fedora-cisco-openh264'... done
Updating metadata for 'fedora-modular'... done
Updating metadata for 'updates-modular'... done
Updating metadata for 'updates'... done
Updating metadata for 'fedora'... done
Updating metadata for 'tailscale-stable'... done
Updating metadata for 'updates-archive'... done
Importing rpm-md... done
error: Bus owner changed, aborting. This likely means the daemon crashed; check logs with `journalctl -xe`.

High memory usage (much more than before the last update):

image

Expected:

# rpm-ostree update
...
Run "systemctl reboot" to start a reboot

image

Steps to reproduce it

  1. Update on a system with a bit less than 1GB of RAM.
  2. See it fail.
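To check whether a host falls into the "a bit less than 1GB of RAM" range from step 1, one option is to read `MemTotal` from `/proc/meminfo`. The sketch below parses a hand-written sample string rather than the real file. The helper name `mem_total_mib` is hypothetical and the sample values are illustrative, not taken from the affected machine.

```python
def mem_total_mib(meminfo_text: str) -> float:
    """Return MemTotal in MiB from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:   <number> kB"
            return int(line.split()[1]) / 1024
    raise ValueError("MemTotal not found")


# Illustrative sample, not captured from the reporter's system.
sample = "MemTotal:         983040 kB\nMemFree:          120000 kB\n"

total = mem_total_mib(sample)
print(f"{total:.0f} MiB total, below 1 GiB: {total < 1024}")
```

On a real host, passing `open("/proc/meminfo").read()` to the function gives the actual figure.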

Non-DNF systems really don't have that problem. It used to work well. AFAIK we don't have 800MB of metadata we need to have in memory all the time. rpm-ostree applies transactional updates, starting from a pre-resolved base image: there is no or little dependency resolution necessary.

Would you like to work on the issue?

No.

Yeah, dnf actually hits this too: https://bugzilla.redhat.com/show_bug.cgi?id=1907030.

rpm-ostree applies transactional updates, starting from a pre-resolved base image: there is no or little dependency resolution necessary.

Not exactly. :) As soon as you use layering, dependency resolution must happen client-side (we need to know what other packages to pull in) and so rpm-ostree has to download repodata just like dnf would (and in fact uses the same stack, hence why it's hitting this same issue). Without layering, updates are purely OSTree-based.

That said, in theory we could have a layering mode where we don't do any libsolv dependency resolution because all the required packages are explicitly provided and librpm can confirm that the deps are satisfied, but that's not the common case.

You can also switch to layering and avoid paying the rpm penalty per machine.

As soon as you use layering

Oh hang on: if there is no layering, there is no libsolv involvement? That changes quite a few things. Thanks for pointing that out!


could have a layering mode where we don't do any libsolv

I wish CoreOS had a container layering mode that could be used just like toolbox (but using the actual booted OS as the base layer instead of downloading a Workstation image). No manual maintenance, and updates alongside the base system. That would also make the not-so-common case you described pretty common, since the starting point of the delta would be the raw ostree image.

You can also switch to layering...

Thanks for the suggestion! I'll have to try both methods (that, and using a tailscale container). I hope the complexity will stay low because, in spite of the problem at hand, layering base packages like this has been bliss.

Oh hang on: if there is no layering, there is no libsolv involvement? That changes quite a few things. Thanks for pointing that out!

Yes. It's been that way since the creation of rpm-ostree. If no rpms are layered, content is fetched via ostree which does no depsolving, just replicates a filesystem tree efficiently.
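As a rough illustration of that distinction, the sketch below inspects parsed `rpm-ostree status --json` output and reports whether the booted deployment has layered packages, which is the case where repodata and depsolving come into play. The key names (`deployments`, `booted`, `requested-packages`, `requested-local-packages`, `packages`) are assumptions about the JSON layout and should be checked against real output. The helper names and the sample document are hypothetical, written by hand to mirror the status shown at the top of this issue.

```python
import json


def booted_deployment(status: dict) -> dict:
    """Return the booted deployment from parsed status JSON."""
    for dep in status.get("deployments", []):
        if dep.get("booted"):
            return dep
    raise ValueError("no booted deployment found")


def uses_layering(dep: dict) -> bool:
    """True if any (assumed) layering-related list is non-empty."""
    keys = ("requested-packages", "requested-local-packages", "packages")
    return any(dep.get(k) for k in keys)


# Hand-written sample; key names are assumptions, not verified output.
sample = json.loads("""
{"deployments": [
  {"booted": true, "version": "38.20231014.3.0",
   "requested-packages": ["tailscale"], "packages": ["tailscale"]}
]}
""")

dep = booted_deployment(sample)
if uses_layering(dep):
    print("layered packages present: updates involve rpm-md and depsolving")
else:
    print("no layering: updates are fetched via ostree only")
```

On a real host the input could come from `json.loads(subprocess.check_output(["rpm-ostree", "status", "--json"]))`.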

Now part of the idea of github.com/containers/bootc is to be a new container-native frontend that very clearly does not involve any rpm at all for the container flow.

In any case, sorry, this is not an rpm-ostree issue; it's a Fedora issue per https://bugzilla.redhat.com/show_bug.cgi?id=1907030

(Note: CentOS, for example, has far fewer packages and hence uses much less memory.)