nullpo-head/wsl-distrod

[Bug]: `systemctl status` failed after upgrading systemd to version 250

winderica opened this issue · 11 comments

Describe the bug

After upgrading systemd to version 250, `systemctl status` no longer works in Distrod and prints: Failed to dump process list for 'PC', ignoring: Input/output error.

Steps to reproduce

  1. Upgrade systemd to version 250.
    ❯  systemctl --version
    systemd 250 (250.1-1-arch)
    +PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified
    
  2. Run systemctl status; it prints the error below.
    ❯ sudo systemctl status
    Failed to dump process list for 'PC', ignoring: Input/output error
    ● PC
        State: running
         Jobs: 0 queued
       Failed: 0 units
        Since: Fri 2022-01-07 00:21:11 CST; 41min ago
       CGroup: /
    
  3. Downgrade systemd to 249 and it works again.
    ❯  systemctl --version
    systemd 249 (249rc3-2-arch)
    +PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified
    ❯ sudo systemctl status
    ● PC
        State: running
         Jobs: 0 queued
       Failed: 0 units
        Since: Fri 2022-01-07 00:21:11 CST; 41min ago
       CGroup: /
               ├─init.scope
               └─system.slice
                 ├─systemd-networkd.service
                 │ └─922 /usr/lib/systemd/systemd-networkd
                 ├─systemd-udevd.service
                 ├─systemd-homed.service
                 │ └─129 /usr/lib/systemd/systemd-homed
                 ├─cronie.service
                 │ └─116 /usr/bin/crond -n
                 ├─docker.service …
                 │ ├─125 /usr/bin/dockerd -H fd://
                 │ └─209 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
                 ├─systemd-journald.service
                 ├─sshd.service
                 │ └─128 sshd: /usr/bin/sshd -D [listener] 0 of 10-100 startups
                 ├─systemd-userdbd.service
                 │ ├─1062 /usr/lib/systemd/systemd-userdbd
                 │ ├─1178 systemd-userwork
                 │ ├─1179 systemd-userwork
                 │ └─1180 systemd-userwork
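
For anyone who wants to check quickly whether their setup is affected, the steps above can be condensed into two small helpers. This is only a sketch: the function names `systemd_major` and `has_dump_failure` are made up for illustration, and the match string is simply the error message shown in step 2.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Distrod or systemd.

# Print the major version from the first line of `systemctl --version`
# output read on stdin, e.g. "systemd 250 (250.1-1-arch)" -> "250".
systemd_major() {
    awk 'NR == 1 { print $2 }'
}

# Succeed if the text on stdin contains the error message from step 2.
has_dump_failure() {
    grep -q "Failed to dump process list"
}
```

Possible usage: `systemctl --version | systemd_major` prints the major version, and `sudo systemctl status 2>&1 | has_dump_failure && echo affected` reports whether the error appears.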

Expected behavior

systemctl status shouldn't fail. Version 250 works fine on another Arch Linux machine of mine.

Windows version

Windows 11 (Build 22523.1000)

Linux kernel version

Linux PC 5.10.74.3-microsoft-standard-WSL2 #1 SMP Mon Oct 18 19:27:44 UTC 2021 x86_64 GNU/Linux

Distro

Arch Linux

How did you install that distro?

Installed by Distrod wizard

Logs

[Distrod][DEBUG] distrod-exec: exec_command_in_distro
[Distrod][DEBUG] starting /init from distrod-exec
[Distrod][DEBUG] WSL envs: "WSLENV" = "WT_SESSION::WT_PROFILE_ID"
[Distrod][DEBUG] WSL envs: "WSL_DISTRO_NAME" = "Distrod"
[Distrod][DEBUG] WSL envs: "WSL_INTEROP" = "/run/WSL/8_interop"
[Distrod][DEBUG] Container::with_mount source: Some(HostPath("/run/distrod/cmdline")), target: ContainerPath("/proc/cmdline"), fstype: None, flags: MS_BIND, is_file: true 
[Distrod][TRACE] mount_distrod_run_files: path: "/opt/distrod/run/systemd"
[Distrod][TRACE] mount_distrod_run_files: path: "/opt/distrod/run/systemd/system"
[Distrod][TRACE] mount_distrod_run_files: path: "/opt/distrod/run/systemd/system/portproxy.service"
[Distrod][DEBUG] Container::with_mount source: Some(HostPath("/opt/distrod/run/systemd/system/portproxy.service")), target: ContainerPath("/run/systemd/system/portproxy.service"), fstype: None, flags: MS_BIND, is_file: true
[Distrod][TRACE] mount_distrod_run_files: path: "/opt/distrod/run/tmpfiles.d"
[Distrod][TRACE] mount_distrod_run_files: path: "/opt/distrod/run/tmpfiles.d/x11.conf"
[Distrod][DEBUG] Container::with_mount source: Some(HostPath("/opt/distrod/run/tmpfiles.d/x11.conf")), target: ContainerPath("/run/tmpfiles.d/x11.conf"), fstype: None, flags: MS_BIND, is_file: true
[Distrod][DEBUG] DistroLauncher::launch
[Distrod][DEBUG] Container::with_mount source: Some(HostPath("/run/distrod/distrod_wsl_env-uid1000")), target: ContainerPath("/run/distrod/distrod_wsl_env-uid1000"), fstype: None, flags: MS_BIND, is_file: true
[Distrod][DEBUG] Spawning the command or the waiter.
[Distrod][DEBUG] Executing a command in the distro.
[Distrod][DEBUG] Failed to ignore signal Sys(EINVAL)
[Distrod][DEBUG] Failed to ignore signal Sys(EINVAL)
[Distrod][DEBUG] Distro::exec_command.
[Distrod][DEBUG] Container::exec_command.
[Distrod][TRACE] mounting source: Some(
    ContainerPath(
        "/run/distrod/cmdline",
    ),
), mount: ContainerMount { source: Some(HostPath("/run/distrod/cmdline")), target: ContainerPath("/proc/cmdline"), fstype: None, flags: MS_BIND, data: None, is_file: true }
[Distrod][DEBUG] dropping privilege. kmsg logging in the child ends here.
[Distrod][DEBUG] Triple fork done.
[Distrod][TRACE] mounting source: Some(
    ContainerPath(
        "/opt/distrod/run/systemd/system/portproxy.service",
    ),
), mount: ContainerMount { source: Some(HostPath("/opt/distrod/run/systemd/system/portproxy.service")), target: ContainerPath("/run/systemd/system/portproxy.service"), fstype: None, flags: MS_BIND, data: None, is_file: true }
[Distrod][TRACE] mounting source: Some(
    ContainerPath(
        "/opt/distrod/run/tmpfiles.d/x11.conf",
    ),
), mount: ContainerMount { source: Some(HostPath("/opt/distrod/run/tmpfiles.d/x11.conf")), target: ContainerPath("/run/tmpfiles.d/x11.conf"), fstype: None, flags: MS_BIND, data: None, is_file: true }
[Distrod][DEBUG] The parent of the second of three forks exits.
[Distrod][TRACE] skipping an identical mount: Some(
    ContainerPath(
        "/run/distrod/distrod_wsl_env-uid1000",
    ),
), ContainerMount {
    source: Some(
        HostPath(
            "/run/distrod/distrod_wsl_env-uid1000",
        ),
    ),
    target: ContainerPath(
        "/run/distrod/distrod_wsl_env-uid1000",
    ),
    fstype: None,
    flags: MS_BIND,
    data: None,
    is_file: true,
}
[Distrod][DEBUG] Spawning the command or the waiter.
[Distrod][DEBUG] Spawning the waiter.
[Distrod][DEBUG] Failed to ignore signal Sys(EINVAL)
[Distrod][DEBUG] Failed to ignore signal Sys(EINVAL)

Additional comment

No response

Thanks for reporting! I'll look into it when I have time. (I'm busy right now because I'm moving house...)

I'm experiencing the same issue. In addition, systemctl --user status outputs:

Failed to connect to bus: No such file or directory

jueti commented

Same issue.

It works fine after downgrading systemd to 249 (249.7-2-arch).

I ran into a file conflict between hwids and hwdata.
I simply backed up the conflicting files and deleted them:

$ cp /usr/share/hwdata/pci.ids .
$ cp /usr/share/hwdata/pnp.ids .
$ cp /usr/share/hwdata/usb.ids .
$ sudo rm /usr/share/hwdata/pci.ids /usr/share/hwdata/pnp.ids /usr/share/hwdata/usb.ids

After that, the downgrade succeeded:

$ sudo downgrade hwids
loading packages...
resolving dependencies...
looking for conflicting packages...

Packages (1) hwids-20201207-1

Total Installed Size:  1.91 MiB

:: Proceed with installation? [Y/n] y
(1/1) checking keys in keyring                                           [########################################] 100%
(1/1) checking package integrity                                         [########################################] 100%
(1/1) loading package files                                              [########################################] 100%
(1/1) checking for file conflicts                                        [########################################] 100%
(1/1) checking available disk space                                      [########################################] 100%
:: Processing package changes...
(1/1) installing hwids                                                   [########################################] 100%
:: Running post-transaction hooks...
(1/1) Arming ConditionNeedsUpdate...
add hwids to IgnorePkg? [y/N] y
$ sudo downgrade systemd
loading packages...
warning: downgrading package systemd (250.3-2 => 249.7-2)
resolving dependencies...
looking for conflicting packages...

Packages (1) systemd-249.7-2


Total Installed Size:  25.80 MiB
Net Upgrade Size:      -2.60 MiB

:: Proceed with installation? [Y/n] y
(1/1) checking keys in keyring                                           [########################################] 100%
(1/1) checking package integrity                                         [########################################] 100%
(1/1) loading package files                                              [########################################] 100%
(1/1) checking for file conflicts                                        [########################################] 100%
(1/1) checking available disk space                                      [########################################] 100%
warning: could not get file information for usr/lib/systemd/system/system-systemd\x2dcryptsetup.slice
:: Processing package changes...
(1/1) downgrading systemd                                                [########################################] 100%
:: Running post-transaction hooks...
(1/9) Creating system user accounts...
(2/9) Updating journal message catalog...
(3/9) Reloading system manager configuration...
(4/9) Updating udev hardware database...
(5/9) Applying kernel sysctl settings...
Couldn't write '16' to 'kernel/sysrq', ignoring: No such file or directory
(6/9) Creating temporary files...
(7/9) Reloading device manager configuration...
(8/9) Arming ConditionNeedsUpdate...
(9/9) Reloading system bus configuration...

systemd 249 (249.7-2-arch) works well.

$ systemctl --version
systemd 249 (249.7-2-arch)
+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified

$ systemctl status
● UD3
    State: running
     Jobs: 1 queued
   Failed: 0 units
    Since: Sat 2022-02-05 11:49:42 CST; 15min ago
   CGroup: /
           ├─init.scope
           │ └─1 /usr/lib/systemd/systemd systemd.setenv=WSL_INTEROP=/run/WSL/8_interop systemd.setenv=WSLENV=WT_SESSIO>
           └─system.slice
             ├─systemd-udevd.service
             │ └─41 /usr/lib/systemd/systemd-udevd
             ├─systemd-journald.service
             │ └─31 /usr/lib/systemd/systemd-journald
             ├─dbus.service
             │ └─59 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog>
             ├─systemd-tmpfiles-clean.service
             │ └─2569 systemd-tmpfiles --clean
             └─systemd-logind.service
               └─60 /usr/lib/systemd/systemd-logind
[jason@UD3 18200]$

Possibly related:

systemd 250-1.2 -> 250-1.1 - "hwids" was replaced by "hwdata"

This may be related to:

systemd/systemd#22089

I am also encountering this on systemd/systemd@v250:

$ systemctl --version; systemctl status
systemd 250 (250.4-2-arch)
+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified
Failed to dump process list for 'Adair', ignoring: Input/output error
● Adair
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Sat 2022-03-26 14:20:25 AEST; 29min ago
   CGroup: /

Hi, thanks for the reports, all. I was busy recently, but I have now started working on Distrod fixes again. I have already figured out the root cause and am working on a fix. The bug only happens on a weird Linux setup (yes, it's WSL), so I'm fixing it on the Distrod side by creating a cgroup namespace. Please wait a while!
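
To illustrate what a cgroup namespace changes, here is a rough sketch using unshare(1) from util-linux. It is not Distrod's code, and the function names `cgroup_ns_id` and `in_new_cgroup_ns` are invented for this example. A process started in a fresh cgroup namespace sees the cgroup it was in at creation time as the root of its hierarchy.

```shell
#!/bin/sh
# Illustration only; this is not how Distrod implements the fix.

# Print the identifier of the cgroup namespace the caller lives in,
# in the form "cgroup:[<inode number>]".
cgroup_ns_id() {
    readlink /proc/self/ns/cgroup
}

# Run a command in a fresh cgroup namespace. Creating the namespace needs
# root, hence sudo. Inside it, /proc/self/cgroup is reported relative to
# the cgroup the process belonged to when the namespace was created.
in_new_cgroup_ns() {
    sudo unshare --cgroup "$@"
}
```

Comparing `cgroup_ns_id` with `in_new_cgroup_ns readlink /proc/self/ns/cgroup` should show two different identifiers, and `in_new_cgroup_ns cat /proc/self/cgroup` shows the paths as seen from inside the new namespace.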

This may be related to:

systemd/systemd#22089

It still does not work after upgrading systemd to version 251. FYI:

❯ systemctl --version
systemd 251 (251-1-arch)
+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified

❯ sudo systemctl status
Failed to dump process list for 'PC', ignoring: Input/output error
● PC
    State: running
    Units: 245 loaded (incl. loaded aliases)
     Jobs: 0 queued
   Failed: 0 units
    Since: Mon 2022-05-23 11:39:33 CST; 5h 6min ago
  systemd: 251-1-arch
  Tainted: cgroupsv1
   CGroup: /
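
The "Tainted: cgroupsv1" line suggests cgroup v1 controllers are mounted even though systemd was built with default-hierarchy=unified. One common way to check which layout is in use is to look at the filesystem type of /sys/fs/cgroup. The sketch below assumes GNU stat, and the helper name `cgroup_layout` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map the filesystem type reported for /sys/fs/cgroup
# to a human-readable description of the cgroup layout.
cgroup_layout() {
    case "$1" in
        cgroup2fs) echo "unified (cgroup v2)" ;;
        tmpfs)     echo "legacy or hybrid (cgroup v1 controllers present)" ;;
        *)         echo "unknown ($1)" ;;
    esac
}
```

Possible usage: `cgroup_layout "$(stat -fc %T /sys/fs/cgroup)"`.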

This may be related to:

systemd/systemd#22089

Tried that patch on top of 250.4 on NixOS and also didn't work.

strace

ioctl(1, TIOCGWINSZ, {ws_row=43, ws_col=190, ws_xpixel=0, ws_ypixel=0}) = 0
sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\4\1\f\0\0\0\3\0\0\0\247\0\0\0\1\1o\0\31\0\0\0/org/freedesktop/systemd1\0\0\0\0\0\0\0\3\1s\0\20\0\0\0GetUnitProcesses\0\0\0\0\0\0\0\0\2\1s\0 \0\0\0org.freedesktop.systemd1.Manager\0\0\0\0\0\0\0\0\6\1s\0\30\0\0\0org.freedesktop.systemd1\0\0\0\0\0\0\0\0\10\1g\0\1s\0\0", iov_len=184}, {iov_base="\7\0\0\0-.slice\0", iov_len=12}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 196
clock_gettime(CLOCK_MONOTONIC, {tv_sec=207, tv_nsec=180553963}) = 0
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\3\1\1\27\0\0\0O\1\0\0]\0\0\0\5\1u\0\3\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\5\0\0\0:1.17\0\0\0\4\1s\0\"\0\0\0org.freedesktop.DBus.Error.IOError\0\0\0\0\0\0\10\1g\0\1s\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\22\0\0\0Input/output error\0", iov_len=111}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 111
writev(2, [{iov_base="Failed to dump process list for 'W-PF2ZA0JC', ignoring: Input/output error", iov_len=74}, {iov_base="\n", iov_len=1}], 2Failed to dump process list for 'W-PF2ZA0JC', ignoring: Input/output error
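
Reading the trace, the failing request is a D-Bus call to GetUnitProcesses on org.freedesktop.systemd1.Manager with the argument "-.slice", and the reply is org.freedesktop.DBus.Error.IOError. The same call can probably be issued directly with busctl, which takes systemctl's own formatting out of the picture. This is an untested sketch: the wrapper name `get_unit_processes` is invented, and the `--` separator is my assumption about how to keep busctl from parsing the leading dash as an option.

```shell
#!/bin/sh
# Hypothetical wrapper around busctl; the D-Bus names are taken from the
# strace output above.
get_unit_processes() {
    # "--" ends option parsing so that a unit name starting with "-"
    # (such as "-.slice") is passed through as an argument.
    busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 \
        org.freedesktop.systemd1.Manager GetUnitProcesses s -- "$1"
}
```

Possible usage: `get_unit_processes -.slice`. On an affected system this would presumably fail with the same Input/output error.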

I made a bit of progress debugging this issue. I followed this post and ran sudo loginctl enable-linger $USER.

Then I read this post and noticed that DBUS_SESSION_BUS_ADDRESS was not set either. I set it using export DBUS_SESSION_BUS_ADDRESS=/run/user/$(id -u $USER)/bus

After that, systemctl should run "normally".
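
A side note on the variable: D-Bus addresses are normally written with a transport prefix, so the conventional form is `unix:path=` followed by the socket path, whereas the export above passes a bare path. Below is a small sketch of building the conventional value. The helper name `user_bus_address` is made up, and it assumes the usual systemd layout where the user bus socket lives at /run/user/<uid>/bus.

```shell
#!/bin/sh
# Hypothetical helper: print the conventional session bus address for the
# given numeric UID, assuming the socket is at /run/user/<uid>/bus.
user_bus_address() {
    printf 'unix:path=/run/user/%s/bus\n' "$1"
}
```

Possible usage: `export DBUS_SESSION_BUS_ADDRESS="$(user_bus_address "$(id -u)")"`.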


Edit: Removed OOT part. You can always see the history 😉

It may be worth noting that the same failure happens when using Microsoft's new built-in systemd support.