coreos/rpm-ostree

rpm-ostree crash installing layered package

jflingevitch opened this issue · 8 comments

Host system details

Provide the output of rpm-ostree status.

rpm-ostree status
State: idle
Deployments:
● fedora:fedora/37/x86_64/silverblue
Version: 37.20230127.0 (2023-01-27T01:05:12Z)
Commit: 4a010ebbfd774433becd127f6b47032d8804f637aae13f8f6b702583209a1c33
GPGSignature: Valid signature by ACB5EE4E831C74BB7C168D27F55AD3FB5323552A

Expected vs actual behavior


# rpm-ostree install thunderbird
Checking out tree 4a010eb... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora rpmfusion-nonfree-nvidia-driver google-chrome updates-archive
Importing rpm-md... done
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2022-10-06T11:01:40Z solvables: 4
rpm-md repo 'fedora-modular' (cached); generated: 2022-11-10T09:23:24Z solvables: 1454
rpm-md repo 'updates-modular' (cached); generated: 2023-01-03T01:27:52Z solvables: 1464
rpm-md repo 'updates' (cached); generated: 2023-01-27T08:43:57Z solvables: 16939
rpm-md repo 'fedora' (cached); generated: 2022-11-10T09:30:00Z solvables: 66822
rpm-md repo 'rpmfusion-nonfree-nvidia-driver' (cached); generated: 2023-01-24T11:25:29Z solvables: 30
rpm-md repo 'google-chrome' (cached); generated: 2023-01-25T14:32:24Z solvables: 3
rpm-md repo 'updates-archive' (cached); generated: 2023-01-27T06:53:17Z solvables: 20422
Resolving dependencies... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
error: Bus owner changed, aborting. This likely means the daemon crashed; check logs with `journalctl -xe`.

Expected:

# rpm-ostree install thunderbird
...
Success!

Steps to reproduce it
Try to install a layered package on Silverblue 37 after the recent rpm-ostree update.

rpm -qa|grep -i rpm-ostree
rpm-ostree-libs-2023.1-1.fc37.x86_64
rpm-ostree-2023.1-1.fc37.x86_64
gnome-software-rpm-ostree-43.3-1.fc37.x86_64

systemctl status rpm-ostreed
× rpm-ostreed.service - rpm-ostree System Management Daemon
Loaded: loaded (/usr/lib/systemd/system/rpm-ostreed.service; static)
Active: failed (Result: core-dump) since Fri 2023-01-27 06:01:59 EST; 3s ago
Duration: 4.739s
Docs: man:rpm-ostree(1)
Process: 6428 ExecStart=rpm-ostree start-daemon (code=dumped, signal=ABRT)
Main PID: 6428 (code=dumped, signal=ABRT)
Status: "clients=1; txn=PkgChange caller=:1.262 path=/org/projectatomic/rpmostree1/fedora"
CPU: 4.112s

Jan 27 06:01:54 fedora rpm-ostree[6428]: Locked sysroot
Jan 27 06:01:54 fedora rpm-ostree[6428]: Initiated txn PkgChange for client(id:cli dbus:1.262 unit:vte-spawn-1502117f-b62d-497d-baf>
Jan 27 06:01:54 fedora rpm-ostree[6428]: Process [pid: 6419 uid: 1001 unit: user@1001.service] connected to transaction progress
Jan 27 06:01:54 fedora rpm-ostree[6428]: Librepo version: 1.15.1 with CURL_GLOBAL_ACK_EINTR support (libcurl/7.85.0 OpenSSL/3.0.5 z>
Jan 27 06:01:58 fedora rpm-ostree[6428]: Preparing pkg txn; enabled repos: ['fedora-cisco-openh264', 'fedora-modular', 'updates-mod>
Jan 27 06:01:58 fedora rpm-ostree[6428]: thread '<unnamed>' panicked at 'assertion failed: tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC a>
Jan 27 06:01:58 fedora rpm-ostree[6428]: note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Jan 27 06:01:59 fedora systemd[1]: rpm-ostreed.service: Main process exited, code=dumped, status=6/ABRT
Jan 27 06:01:59 fedora systemd[1]: rpm-ostreed.service: Failed with result 'core-dump'.
Jan 27 06:01:59 fedora systemd[1]: rpm-ostreed.service: Consumed 4.112s CPU time.

Provide any additional data that may help debug this - which specific version of
an RPM is in the repo, or any host system configuration.

Would you like to work on the issue?

Please let us know if you can work on it or the issue should be assigned to
someone else.
I cannot work on this issue.

In Fedora Silverblue 37, the rpm-ostree package version was bumped in this commit:

commit 5168f110ed63e77b7f1f8fe2b67531d87c340a2b6935072e11e05528f3196847
Parent: 9eba0f2d2ce05fc9a34f7a94ac162ce76f9e720d7f2c78205aa280970fea56d6
ContentChecksum: 31a99de13e282afeff25c276c197f162cff9d73b769fdbdf9f8c0bad95a0c390
Date: 2023-01-25 00:44:48 +0000
Version: 37.20230125.0

rpm-ostree db diff 99a409a04f7249f49224be6b07f45c082ed82ffa83d75c2eaddc00d66a999a4c 5168f110ed63e77b7f1f8fe2b67531d87c340a2b6935072e11e05528f3196847
.....
rpm-ostree 2022.19-2.fc37 -> 2023.1-1.fc37
rpm-ostree-libs 2022.19-2.fc37 -> 2023.1-1.fc37

Can you add a systemd unit override for rpm-ostreed.service with something like:

[Service]
Environment=RUST_BACKTRACE=1

and try again to see if we can get a backtrace?
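One way to apply such an override is a systemd drop-in file. The following is only a sketch under assumptions, not necessarily the procedure the maintainers had in mind: the helper name write_backtrace_dropin and the file name backtrace.conf are made up for illustration, and the drop-in directory in the usage comment follows the usual systemd convention for unit overrides.

```shell
#!/bin/sh
# Hypothetical helper: writes a drop-in that sets RUST_BACKTRACE=1.
# $1 is the drop-in directory; the file name backtrace.conf is arbitrary.
write_backtrace_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nEnvironment=RUST_BACKTRACE=1\n' > "$dropin_dir/backtrace.conf"
}

# Assumed usage on the affected host (as root):
#   write_backtrace_dropin /etc/systemd/system/rpm-ostreed.service.d
#   systemctl daemon-reload
#   rpm-ostree install thunderbird
#   journalctl -u rpm-ostreed.service -b
```

Running `systemctl edit rpm-ostreed.service` and pasting the two lines should have the same effect; the helper just makes the step scriptable.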

Attached two output files here:

It looks like the output is missing debugging symbols?

journalctl -u rpm-ostreed.service -b
journalctl_out.txt

sudo coredumpctl info > coredumpinfo.txt
coredumpinfo.txt

This might be from a Rust dependency update. If you have the actual coredump handy, could you upload that somewhere? We can dig into it with full debuginfo.

Still having this issue on one machine, but another machine with similar setup is fine. If you have any suggestions on troubleshooting, I would appreciate it. Thanks.

I had to split the coredump into 2 pieces with the command below, then gzipped them:
split -n 2 rpm-ostree.coredump my_coredump_

my_coredump_aa.gz

my_coredump_ab.gz
silverblue37_info_30Jan2023.txt
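For anyone downloading the pieces, a minimal sketch of how they could be put back together, assuming they were produced by the split command above and then compressed with plain gzip. The helper name reassemble_coredump and the output file name are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rebuild a coredump from gzipped split pieces.
# $1 = piece prefix (e.g. my_coredump_), $2 = output file.
reassemble_coredump() {
    prefix=$1
    out=$2
    # The shell expands the glob in sorted order (aa, ab, ...),
    # which matches the order in which split wrote the pieces.
    # gzip -dc decompresses each piece to stdout, one after another.
    gzip -dc "$prefix"*.gz > "$out"
}

# Assumed usage:
#   reassemble_coredump my_coredump_ rpm-ostree.coredump
```

Comparing a checksum (for example sha256sum) of the rebuilt file against one taken from the original before splitting would confirm that nothing was lost in transit.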

I faced this issue while testing the infra osbuild Ansible Collection to generate new RHEL OSTree images.

In my case, after adding a new ostree repository to the HTTP server, rpm-ostree upgrade --check and rpm-ostree upgrade --preview stop working. A plain rpm-ostree upgrade still works, and after running it, --check and --preview work again.

I've included information about it, including captures and how to reproduce it here:

https://github.com/luisarizmendi/bugs

Here is my latest attempt to try to get a backtrace of this crash. Still haven't solved it, but maybe this contains some clues as to what is going wrong.

backtrace.log

The triggering assertion is here https://docs.rs/time/0.1.44/src/time/lib.rs.html#89

Something is creating an invalid time; unfortunately, there are a lot of places that could do that. I haven't been able to get a useful backtrace from the core dumps so far.
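For context, the visible part of the assertion in the journal requires the nanosecond field of a timestamp to satisfy 0 <= tv_nsec < NSEC_PER_SEC. A minimal sketch of that same range check in shell follows; the function name is_valid_tv_nsec is made up for illustration and is not part of rpm-ostree or the time crate.

```shell
#!/bin/sh
# One second expressed in nanoseconds, mirroring NSEC_PER_SEC in the assertion.
NSEC_PER_SEC=1000000000

# Hypothetical helper: succeeds only if $1 is a valid tv_nsec value,
# i.e. 0 <= tv_nsec < NSEC_PER_SEC.
is_valid_tv_nsec() {
    [ "$1" -ge 0 ] && [ "$1" -lt "$NSEC_PER_SEC" ]
}

# Assumed usage:
#   is_valid_tv_nsec 999999999  && echo "ok"
#   is_valid_tv_nsec 1000000000 || echo "would trip the assertion"
```

Any code path that builds a timestamp with a negative or overflowing nanosecond value would fail this check, which is why the set of possible culprits is so wide.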

I've included information about it, including captures and how to reproduce it here:
https://github.com/luisarizmendi/bugs

Ah...wow! Nice job recording all this, though that's quite a bit of setup. I think though what we need to do is:

  • Add more debug logging
  • Get you a build with that logging enabled