Eraden/amdgpud

Service doesn't start correctly after fresh PC start

r15ch13 opened this issue · 15 comments

The service doesn't run correctly if the PC starts. It seems that the card is not ready and can't be found. Maybe adding a second call to controllers(&config, true)?; in service() could help.

I think we should sleep for configurable amount of time and retry one or two times then exit with some error. WDYT?

Sounds good to me. 👍🏽 😄

I think we should sleep for configurable amount of time and retry one or two times then exit with some error. WDYT?

Either that or what would be more elegant in my opinion: Set up the systemd service in such a way that it only tries to start once all the devices are available. But I don't know if/how that is possible. If it is possible then I think it should work with the Requires=-field of the service.

Or let the service fail when it hasn't found a card and let systemd restart it?

Yes, that is also a good idea. Should be possible with something like

Restart=on-failure
RestartSec=5s

We also support OpenRC. So if this OpenRC restarts service and can wait then I'm for fixing service files

Does this issue even exist for OpenRC-users? Maybe it's only systemd who is trying to start this program too early in the boot process.

For testing I have added the following to service() and changed the service to @BoostCookie's suggestion.

    if controllers.len() == 0 {
        log::error!("No card found!");
        return Err(std::io::Error::from(ErrorKind::NotFound))
    }

Will try it now. 😄

Yup, that works.

Also, here is the dmesg output which proofs that the service is loaded before the graphics card is online. (I had amdfand.service set to 755 accidentally, which showed a warning. 😅)

[    3.346704] systemd[1]: Detected architecture x86-64.
[    3.347199] systemd[1]: Hostname set to <Bluescreen>.
[    3.425349] random: lvmconfig: uninitialized urandom read (4 bytes read)
[    3.463346] usb 3-1: new high-speed USB device number 2 using xhci_hcd

// amdfand is loaded

[    3.470814] systemd[1]: Configuration file /usr/lib/systemd/system/amdfand.service is marked executable. Please remove executable permission bits. Proceeding anyway.
[    3.471698] systemd[1]: Queued start job for default target Graphical Interface.
[    3.510764] systemd[1]: Created slice Virtual Machine and Container Slice.
[    3.511082] systemd[1]: Created slice Slice /system/getty.
[    3.511267] systemd[1]: Created slice Slice /system/modprobe.
[    3.511427] systemd[1]: Created slice Slice /system/systemd-fsck.
[    3.511574] systemd[1]: Created slice User and Session Slice.
[    3.511612] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[    3.511639] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[    3.511725] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
// more systemd stuff
[    3.579849] audit: type=1400 audit(1627502350.757:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=353 comm="apparmor_parser"
[    3.580489] audit: type=1400 audit(1627502350.760:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=353 comm="apparmor_parser"
[    3.581879] audit: type=1400 audit(1627502350.760:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=355 comm="apparmor_parser"
[    3.585699] systemd[1]: Started Rule-based Manager for Device Events and Files.
[    3.585733] audit: type=1130 audit(1627502350.764:5): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-udevd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    3.590987] systemd[1]: Finished Coldplug All udev Devices.
[    3.591028] audit: type=1130 audit(1627502350.770:6): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-udev-trigger comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    3.609318] acpi_cpufreq: overriding BIOS provided _PSD data
[    3.609596] systemd[1]: Started Journal Service.
[    3.609829] audit: type=1130 audit(1627502350.787:7): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    3.611927] usb 3-1: New USB device found, idVendor=05e3, idProduct=0610, bcdDevice= 6.55
[    3.611929] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[    3.611931] usb 3-1: Product: USB2.1 Hub
[    3.611931] usb 3-1: Manufacturer: GenesysLogic
[    3.614892] systemd-journald[361]: Received client request to flush runtime journal.
[    3.630198] tpm_crb MSFT0101:00: [Firmware Bug]: ACPI region does not cover the entire command/response buffer. [mem 0xbd87b000-0xbd87bfff flags 0x200] vs bd87b000 4000
[    3.630201] tpm_crb MSFT0101:00: can't request region for resource [mem 0xbd87b000-0xbd87bfff]
[    3.630207] tpm_crb: probe of MSFT0101:00 failed with error -16
[    3.654986] random: mktemp: uninitialized urandom read (6 bytes read)
[    3.656370] audit: type=1400 audit(1627502350.834:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="dovecot-script-login" pid=356 comm="apparmor_parser"
[    3.664218] hub 3-1:1.0: USB hub found
[    3.664801] hub 3-1:1.0: 4 ports detected
[    3.667725] audit: type=1400 audit(1627502350.847:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="php-fpm" pid=354 comm="apparmor_parser"
[    3.671450] audit: type=1130 audit(1627502350.850:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    3.704244] RAPL PMU: API unit is 2^-32 Joules, 1 fixed counters, 163840 ms ovfl timer
[    3.704246] RAPL PMU: hw unit of domain package 2^-16 Joules
[    3.722759] random: crng init done
[    3.722760] random: 1 urandom warning(s) missed due to ratelimiting
[    3.763822] input: PC Speaker as /devices/platform/pcspkr/input/input2
[    3.764052] dca service started, version 1.12.1
[    3.764131] piix4_smbus 0000:00:14.0: SMBus Host Controller at 0xb00, revision 0
[    3.764135] piix4_smbus 0000:00:14.0: Using register 0x02 for SMBus port selection
[    3.764235] piix4_smbus 0000:00:14.0: Auxiliary SMBus Host Controller at 0xb20
[    3.765314] ccp 0000:20:00.2: enabling device (0000 -> 0002)
[    3.765772] ccp 0000:20:00.2: ccp enabled
[    3.767425] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
[    3.767498] sp5100-tco sp5100-tco: Using 0xfeb00000 for watchdog MMIO address
[    3.767712] cryptd: max_cpu_qlen set to 1000
[    3.767875] sp5100-tco sp5100-tco: initialized. heartbeat=60 sec (nowayout=0)
[    3.777700] AVX2 version of gcm_enc/dec engaged.
[    3.777701] AES CTR mode by8 optimization enabled
[    3.780409] igb: Intel(R) Gigabit Ethernet Network Driver
[    3.780411] igb: Copyright (c) 2007-2014 Intel Corporation.
[    3.796715] usb 4-4: Device not responding to setup address.
[    3.809603] pps pps0: new PPS source ptp0
[    3.809646] igb 0000:18:00.0: added PHC on eth0
[    3.809647] igb 0000:18:00.0: Intel(R) Gigabit Ethernet Network Connection
[    3.809649] igb 0000:18:00.0: eth0: (PCIe:2.5Gb/s:Width x1) 00:d8:61:2a:f2:93
[    3.809651] igb 0000:18:00.0: eth0: PBA No: FFFFFF-0FF
[    3.809652] igb 0000:18:00.0: Using MSI-X interrupts. 2 rx queue(s), 2 tx queue(s)
[    3.892917] kvm: Nested Virtualization enabled
[    3.892927] SVM: kvm: Nested Paging enabled
[    3.892928] SVM: Virtual VMLOAD VMSAVE supported
[    3.892928] SVM: Virtual GIF supported
[    3.896097] MCE: In-kernel MCE decoding enabled.
[    3.899036] EDAC amd64: F17h detected (node 0).
[    3.899076] EDAC amd64: Node 0: DRAM ECC disabled.
[    3.904634] usb 1-12: New USB device found, idVendor=04d9, idProduct=0169, bcdDevice= 1.10
[    3.904636] usb 1-12: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[    3.904637] usb 1-12: Product: USB Keyboard
[    3.915047] snd_hda_intel 0000:1e:00.1: Force to non-snoop mode
[    3.915469] snd_hda_intel 0000:1f:00.1: enabling device (0000 -> 0002)
[    3.915501] snd_hda_intel 0000:1f:00.1: Disabling MSI
[    3.915505] snd_hda_intel 0000:1f:00.1: Handle vga_switcheroo audio client
[    3.915555] snd_hda_intel 0000:21:00.3: enabling device (0000 -> 0002)
[    4.003362] usb 4-4: Device not responding to setup address.
[    4.030598] EDAC amd64: F17h detected (node 0).
[    4.030638] EDAC amd64: Node 0: DRAM ECC disabled.
[    4.040020] usb 1-13: new full-speed USB device number 3 using xhci_hcd

// first time something mentions ATI

[    4.045666] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input3
[    4.045727] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input4
[    4.045797] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input5
[    4.045835] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input9
[    4.045865] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input6
[    4.045912] input: HDA ATI HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input7
[    4.045963] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input10
[    4.046005] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input11
[    4.046052] input: HDA ATI HDMI HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input8
[    4.046174] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input12
[    4.046218] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input13
[    4.046262] input: HDA NVidia HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input14
[    4.046307] input: HDA NVidia HDMI/DP,pcm=12 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input15
[    4.082064] nct6775: Found NCT6795D or compatible chip at 0x4e:0xa20
[    4.082182] MXM: GUID detected in BIOS
[    4.082227] nouveau 0000:1f:00.0: enabling device (0000 -> 0003)
[    4.082358] checking generic (e0000000 300000) vs hw (f6000000 1000000)
[    4.082359] checking generic (e0000000 300000) vs hw (c0000000 10000000)
[    4.082359] checking generic (e0000000 300000) vs hw (d0000000 2000000)
[    4.082395] nouveau 0000:1f:00.0: NVIDIA GP104 (134000a1)
[    4.144046] EDAC amd64: F17h detected (node 0).
[    4.144082] EDAC amd64: Node 0: DRAM ECC disabled.

Please confirm if changing Requires fixed issue. If not reopen issue

CC: @BoostCookie @r15ch13

Tested it with #12 and with the following code but it didn't work. See journalctl output below.

image

image
(Note: I just edited the service file and run systemctl daemon-reload, so I didn't get the error message mentioned in #10 (comment) but I can reproduce it with systemctl enable amdfand)

// amdfand started
Jul 29 18:06:51 Bluescreen systemd[1]: Started amdfan controller.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=amdfand comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Avahi mDNS/DNS-SD Stack...
Jul 29 18:06:51 Bluescreen systemd[1]: Started Periodic Command Scheduler.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=cronie comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Started D-Bus System Message Bus.
Jul 29 18:06:51 Bluescreen crond[487]: (CRON) STARTUP (1.5.7)
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=dbus comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Network Manager...
Jul 29 18:06:51 Bluescreen crond[487]: (CRON) INFO (Syslog will be used instead of sendmail.)

// amdfand didnt' find a card
Jul 29 18:06:51 Bluescreen amdfand[485]: Cards found: 0
Jul 29 18:06:51 Bluescreen crond[487]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 7% if used.)
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Clean up modules from old kernels...
Jul 29 18:06:51 Bluescreen crond[487]: (CRON) INFO (running with inotify support)
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Initialize hardware monitoring sensors...
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Found user 'avahi' (UID 972) and group 'avahi' (GID 972).
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Authorization Manager...
Jul 29 18:06:51 Bluescreen linux-module-cleanup[494]: cat: /usr/lib/modules/.old: No such file or directory
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=10 op=LOAD
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=11 op=LOAD
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=12 op=LOAD
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Snap Daemon...
Jul 29 18:06:51 Bluescreen systemd[1]: Starting User Login Management...
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=13 op=LOAD
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=14 op=LOAD
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Virtual Machine and Container Registration Service...
Jul 29 18:06:51 Bluescreen systemd[1]: linux-module-cleanup.service: Deactivated successfully.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=linux-module-cleanup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=linux-module-cleanup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Successfully dropped root privileges.
Jul 29 18:06:51 Bluescreen systemd[1]: Finished Clean up modules from old kernels.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: avahi-daemon 0.8 starting up.
Jul 29 18:06:51 Bluescreen polkitd[492]: Started polkitd version 0.119
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.0837] NetworkManager (version 1.32.4-1) is starting... (for the first time)
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.0841] Read config: /etc/NetworkManager/NetworkManager.conf (lib: 20-connectivity.conf)
Jul 29 18:06:51 Bluescreen systemd-logind[495]: Watching system buttons on /dev/input/event1 (Power Button)
Jul 29 18:06:51 Bluescreen systemd-logind[495]: Watching system buttons on /dev/input/event0 (Power Button)
Jul 29 18:06:51 Bluescreen systemd-logind[495]: New seat seat0.
Jul 29 18:06:51 Bluescreen systemd[1]: Started Virtual Machine and Container Registration Service.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-machined comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Started User Login Management.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-logind comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Started Avahi mDNS/DNS-SD Stack.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=avahi-daemon comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen snapd[493]: AppArmor status: apparmor is enabled but some kernel features are missing: dbus, network
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Successfully called chroot().
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Successfully dropped remaining capabilities.
Jul 29 18:06:51 Bluescreen systemd[1]: Started Network Manager.
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.0954] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: No service file found in /etc/avahi/services.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=NetworkManager comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Reached target Network.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Joining mDNS multicast group on interface lo.IPv6 with address ::1.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: New relevant interface lo.IPv6 for mDNS.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Joining mDNS multicast group on interface lo.IPv4 with address 127.0.0.1.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: New relevant interface lo.IPv4 for mDNS.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Network interface enumeration completed.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Registering new address record for ::1 on lo.*.
Jul 29 18:06:51 Bluescreen avahi-daemon[486]: Registering new address record for 127.0.0.1 on lo.IPv4.
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Network Manager Wait Online...
Jul 29 18:06:51 Bluescreen kernel: usb 1-12: New USB device found, idVendor=04d9, idProduct=0169, bcdDevice= 1.10
Jul 29 18:06:51 Bluescreen kernel: usb 1-12: New USB device strings: Mfr=0, Product=2, SerialNumber=0
Jul 29 18:06:51 Bluescreen kernel: usb 1-12: Product: USB Keyboard
Jul 29 18:06:51 Bluescreen systemd[1]: Starting containerd container runtime...
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.0979] manager[0x55b3998a7000]: monitoring kernel firmware directory '/lib/firmware'.
Jul 29 18:06:51 Bluescreen systemd[1]: Starting CUPS Scheduler...
Jul 29 18:06:51 Bluescreen dbus-daemon[488]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.5' (uid=0 pid=489 comm="/usr/bin/NetworkManager --no-daemon ")
Jul 29 18:06:51 Bluescreen systemd[1]: Started Virtualization daemon.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=libvirtd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Permit User Sessions...
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=15 op=LOAD
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=16 op=LOAD
Jul 29 18:06:51 Bluescreen audit: BPF prog-id=17 op=LOAD
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Hostname Service...
Jul 29 18:06:51 Bluescreen systemd[1]: Finished Permit User Sessions.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-user-sessions comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Started Simple Desktop Display Manager.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=sddm comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen snapd[493]: daemon.go:248: started snapd/2.51.3-2 (series 16; classic; devmode) manjaro/ (amd64) linux/5.10.53-1-MANJARO.
Jul 29 18:06:51 Bluescreen polkitd[492]: Loading rules from directory /etc/polkit-1/rules.d
Jul 29 18:06:51 Bluescreen polkitd[492]: Loading rules from directory /usr/share/polkit-1/rules.d
Jul 29 18:06:51 Bluescreen systemd[1]: Started CUPS Scheduler.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=cups comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen polkitd[492]: Finished loading, compiling and executing 8 rules
Jul 29 18:06:51 Bluescreen systemd[1]: Started Authorization Manager.
Jul 29 18:06:51 Bluescreen polkitd[492]: Acquired the name org.freedesktop.PolicyKit1 on the system bus
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=polkit comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen systemd[1]: Starting Modem Manager...
Jul 29 18:06:51 Bluescreen dbus-daemon[488]: [system] Successfully activated service 'org.freedesktop.hostname1'
Jul 29 18:06:51 Bluescreen systemd[1]: Started Hostname Service.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.1279] hostname: hostname: using hostnamed
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.1280] hostname: hostname changed from (none) to "Bluescreen"
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.1282] dns-mgr[0x55b399862250]: init: dns=default,systemd-resolved rc-manager=symlink
Jul 29 18:06:51 Bluescreen sddm[534]: Initializing...
Jul 29 18:06:51 Bluescreen sddm[534]: Starting...
Jul 29 18:06:51 Bluescreen sddm[534]: Logind interface found
Jul 29 18:06:51 Bluescreen ModemManager[557]: <info>  ModemManager (version 1.16.8) starting in system bus...
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.1497] manager[0x55b3998a7000]: rfkill: Wi-Fi hardware radio set enabled
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.1498] manager[0x55b3998a7000]: rfkill: WWAN hardware radio set enabled
Jul 29 18:06:51 Bluescreen kernel: EDAC amd64: F17h detected (node 0).
Jul 29 18:06:51 Bluescreen kernel: EDAC amd64: Node 0: DRAM ECC disabled.
Jul 29 18:06:51 Bluescreen kernel: loop: module loaded
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.1519] Loaded device plugin: NMAtmManager (/usr/lib/NetworkManager/1.32.4-1/libnm-device-plugin-adsl.so)
Jul 29 18:06:51 Bluescreen systemd[1]: Started Modem Manager.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=ModemManager comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen kernel: nct6775: Found NCT6795D or compatible chip at 0x4e:0xa20
// snip network
Jul 29 18:06:51 Bluescreen dbus-daemon[488]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jul 29 18:06:51 Bluescreen systemd[1]: Started Network Manager Script Dispatcher Service.
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.1685] modem-manager: ModemManager available
Jul 29 18:06:51 Bluescreen kernel: MXM: GUID detected in BIOS
Jul 29 18:06:51 Bluescreen kernel: nouveau 0000:1f:00.0: enabling device (0000 -> 0003)
Jul 29 18:06:51 Bluescreen kernel: checking generic (e0000000 300000) vs hw (f6000000 1000000)
Jul 29 18:06:51 Bluescreen kernel: checking generic (e0000000 300000) vs hw (c0000000 10000000)
Jul 29 18:06:51 Bluescreen kernel: checking generic (e0000000 300000) vs hw (d0000000 2000000)
Jul 29 18:06:51 Bluescreen kernel: nouveau 0000:1f:00.0: NVIDIA GP104 (134000a1)
Jul 29 18:06:51 Bluescreen kernel: snd_hda_intel 0000:1e:00.1: Force to non-snoop mode
Jul 29 18:06:51 Bluescreen kernel: snd_hda_intel 0000:1f:00.1: enabling device (0000 -> 0002)
Jul 29 18:06:51 Bluescreen kernel: snd_hda_intel 0000:1f:00.1: Disabling MSI
Jul 29 18:06:51 Bluescreen kernel: snd_hda_intel 0000:1f:00.1: Handle vga_switcheroo audio client
Jul 29 18:06:51 Bluescreen kernel: snd_hda_intel 0000:21:00.3: enabling device (0000 -> 0002)
Jul 29 18:06:51 Bluescreen containerd[576]: time="2021-07-29T18:06:51.207647125+02:00" level=info msg="starting containerd" revision=69107e47a62e1d690afa2b9b1d43f8ece3ff4483.m version=v1.5.4
// snip containerd
Jul 29 18:06:51 Bluescreen audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=containerd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 29 18:06:51 Bluescreen kernel: EDAC amd64: F17h detected (node 0).
Jul 29 18:06:51 Bluescreen kernel: EDAC amd64: Node 0: DRAM ECC disabled.
Jul 29 18:06:51 Bluescreen kernel: bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
Jul 29 18:06:51 Bluescreen NetworkManager[489]: <info>  [1627574811.2820] manager: (virbr0): new Bridge device (/org/freedesktop/NetworkManager/Devices/3)
Jul 29 18:06:51 Bluescreen kernel: squashfs: version 4.0 (2009/01/31) Phillip Lougher
Jul 29 18:06:51 Bluescreen kernel: nouveau 0000:1f:00.0: bios: version 86.04.85.00.9c
Jul 29 18:06:51 Bluescreen kernel: nouveau 0000:1f:00.0: pmu: firmware unavailable

// amd card loaded
Jul 29 18:06:51 Bluescreen kernel: input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input3
Jul 29 18:06:51 Bluescreen kernel: input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input4
Jul 29 18:06:51 Bluescreen kernel: input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input5
Jul 29 18:06:51 Bluescreen kernel: input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input6
Jul 29 18:06:51 Bluescreen kernel: input: HDA ATI HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input7
Jul 29 18:06:51 Bluescreen kernel: input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input9
Jul 29 18:06:51 Bluescreen kernel: input: HDA ATI HDMI HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:03.1/0000:1c:00.0/0000:1d:00.0/0000:1e:00.1/sound/card0/input8
Jul 29 18:06:51 Bluescreen kernel: input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input10
Jul 29 18:06:51 Bluescreen kernel: input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input11
Jul 29 18:06:51 Bluescreen kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input12
Jul 29 18:06:51 Bluescreen kernel: input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input13
Jul 29 18:06:51 Bluescreen kernel: input: HDA NVidia HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input14
Jul 29 18:06:51 Bluescreen kernel: input: HDA NVidia HDMI/DP,pcm=12 as /devices/pci0000:00/0000:00:03.2/0000:1f:00.1/sound/card1/input15
Jul 29 18:06:51 Bluescreen kernel: usb 1-13: new full-speed USB device number 3 using xhci_hcd
Jul 29 18:06:51 Bluescreen kernel: nouveau 0000:1f:00.0: fb: 8192 MiB GDDR5
Jul 29 18:06:51 Bluescreen snapd[493]: daemon.go:341: adjusting startup timeout by 30s (pessimistic estimate of 30s plus 5s per snap)
Jul 29 18:06:51 Bluescreen systemd[1]: Started Snap Daemon.

Using

[Unit]
Description=amdfan controller
[Service]
ExecStart=/usr/bin/amdfand service
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=default.target

and

    if controllers.len() == 0 {
        log::error!("No card found!");
        return Err(std::io::Error::from(ErrorKind::NotFound))
    }

works for me.

Please confirm if changing Requires fixed issue. If not reopen issue

Nope, I can't enable the new service:

% sudo systemctl enable amdfand.service
The unit files have no installation config (WantedBy=, RequiredBy=, Also=,
Alias= settings in the [Install] section, and DefaultInstance= for template
units). This means they are not meant to be enabled using systemctl.

Possible reasons for having this kind of units are:
• A unit may be statically enabled by being symlinked from another unit's
  .wants/ or .requires/ directory.
• A unit's purpose may be to act as a helper for some other unit which has
  a requirement dependency on it.
• A unit may be started when needed via activation (socket, path, timer,
  D-Bus, udev, scripted systemctl call, ...).
• In case of template units, the unit is meant to be enabled with some
  instance name specified.

I've now tried the following systemd-unit file and it works:

[Unit]
Description=amdfan controller
Requires=graphical.target
[Service]
ExecStart=/usr/local/bin/amdfand service
Restart=always
[Install]
WantedBy=multi-user.target

Changed and pushed

I'm sorry, the ExecStart was wrong. It is the path I was using for testing.
ExecStart=/usr/bin/amdfand service is the correct path.