nixpkgs `services.lorri` requires reboot for the user lorri.socket to appear
Profpatsch opened this issue · 11 comments
Describe the bug
When putting systemd.lorri.enable = true;
in your system config, you would assume that the socket unit appears immediately after a system switch, but in fact it only does after reboot. Logging out and into the user account does not help.
To Reproduce
- Remove any lorri from your system
- If you had the service active, remove from your config and reboot
- Add
lorri.service.enable = true;
and switch your system (at least nixos 20.03) systemctl --user
does not have any lorri service or socket- reboot
systemctl --user
has thelorri.socket
now
Expected behavior
The socket file should appear immediately after the NixOS system switch.
Additional context
Grepping through nixpkgs
for systemd.user.sockets
(and similar) only yields like three results, so it might be that starting a .socket
unit for the user daemon is just not a common use-case and we’re in uncharted territory.
Possible solutions I can think of:
- Can we set up lorri as system service instead of user service? Would there be cons?
- Document that a reboot is necessary. That’s meh
- Fix the
systemd.user
config to update on switch. Probably hard, might even be impossible.
cc @curiousleo
Doesn’t require reboot, just a reload of the user daemon:
systemctl --user daemon-reload
This is a regression in nixpkgs: NixOS/nixpkgs@b20a0e4#r38459106
@Profpatsch I'd like to ask a (possibly naive) question about this issue.
Why is the socket required at all? It seems like the weird interaction with systemd sockets is one of the major pain points for new users. As such, would it not be better to just have the very resource-light lorri daemon
command always be running? Most other user services e.g. on NixOS already seem to operate in this manner.
As such, would it not be better to just have the very resource-light
lorri daemon
command always be running? Most other user services e.g. on NixOS already seem to operate in this manner.
the .socket
file should just be a small improvement over the service file, true. Maybe we should remove it for less confusion.
However I’m fairly sure that the issue above pertains to the .service
file as well.
I have an issue I think is related where the lorri socket doesn't seem to reactivate after stopping it or reloading it:
[cody@nixos:~]$ systemctl --user daemon-reload
[cody@nixos:~]$ systemctl --user status lorri.service
● lorri.service - lorri build daemon
Loaded: loaded (/nix/store/8241pdh8fvrwg8ava3476rnb3050laqc-home-manager-f>
Active: inactive (dead) since Tue 2020-06-09 18:14:09 CDT; 7min ago
TriggeredBy: ● lorri.socket
Main PID: 2100 (code=killed, signal=TERM)
Jun 09 18:12:43 nixos systemd[1853]: Started lorri build daemon.
Jun 09 18:12:43 nixos lorri[2100]: Jun 09 18:12:43.338 INFO ready
Jun 09 18:13:24 nixos lorri[2100]: Jun 09 18:13:24.154 INFO build status, messa>
Jun 09 18:13:33 nixos lorri[2100]: Jun 09 18:13:33.297 INFO build status, messa>
Jun 09 18:13:41 nixos lorri[2100]: Jun 09 18:13:41.941 INFO build status, messa>
Jun 09 18:13:45 nixos lorri[2100]: Jun 09 18:13:45.933 INFO build status, messa>
Jun 09 18:14:09 nixos systemd[1853]: Stopping lorri build daemon...
Jun 09 18:14:09 nixos systemd[1853]: lorri.service: Succeeded.
Jun 09 18:14:09 nixos systemd[1853]: Stopped lorri build daemon.
[cody@nixos:~]$ cd smurf
direnv: loading ~/smurf/.envrc
Jun 09 18:22:01.771 INFO lorri daemon is not running, loading a cached environment, expr: /home/cody/smurf/shell.nix
direnv: export +AR +AS +CC +CONFIG_SHELL +CXX +HOST_PATH +IN_NIX_SHELL +LD +NIX_BINTOOLS +NIX_BINTOOLS_WRAPPER_x86_64_unknown_linux_gnu_TARGET_HOST +NIX_BUILD_CORES +NIX_BUILD_TOP +NIX_CC +NIX_CC_WRAPPER_x86_64_unknown_linux_gnu_TARGET_HOST +NIX_ENFORCE_NO_NATIVE +NIX_GHC +NIX_GHCPKG +NIX_GHC_DOCDIR +NIX_GHC_LIBDIR +NIX_HARDENING_ENABLE +NIX_INDENT_MAKE +NIX_LDFLAGS +NIX_LOG_FD +NIX_STORE +NM +OBJCOPY +OBJDUMP +RANLIB +READELF +SIZE +SOURCE_DATE_EPOCH +STRINGS +STRIP +allowSubstitutes +buildInputs +builder +configureFlags +depsBuildBuild +depsBuildBuildPropagated +depsBuildTarget +depsBuildTargetPropagated +depsHostHost +depsHostHostPropagated +depsTargetTarget +depsTargetTargetPropagated +doCheck +doInstallCheck +extraClosure +installPhase +name +nativeBuildInputs +origArgs +origBuilder +origExtraClosure +origOutputs +origPATH +origSystem +out +outputs +patches +phases +preHook +preferLocalBuild +propagatedBuildInputs +propagatedNativeBuildInputs +shell +shellHook +stdenv +strictDeps +system ~LOCALE_ARCHIVE ~PATH
[cody@nixos:~/smurf]$ systemctl --user status lorri.service
● lorri.service - lorri build daemon
Loaded: loaded (/nix/store/8241pdh8fvrwg8ava3476rnb3050laqc-home-manager-f>
Active: inactive (dead) since Tue 2020-06-09 18:14:09 CDT; 7min ago
TriggeredBy: ● lorri.socket
Main PID: 2100 (code=killed, signal=TERM)
Jun 09 18:12:43 nixos systemd[1853]: Started lorri build daemon.
Jun 09 18:12:43 nixos lorri[2100]: Jun 09 18:12:43.338 INFO ready
Jun 09 18:13:24 nixos lorri[2100]: Jun 09 18:13:24.154 INFO build status, messa>
Jun 09 18:13:33 nixos lorri[2100]: Jun 09 18:13:33.297 INFO build status, messa>
Jun 09 18:13:41 nixos lorri[2100]: Jun 09 18:13:41.941 INFO build status, messa>
Jun 09 18:13:45 nixos lorri[2100]: Jun 09 18:13:45.933 INFO build status, messa>
Jun 09 18:14:09 nixos systemd[1853]: Stopping lorri build daemon...
Jun 09 18:14:09 nixos systemd[1853]: lorri.service: Succeeded.
Jun 09 18:14:09 nixos systemd[1853]: Stopped lorri build daemon.
I am seeing a similar thing as @codygman
In my case there is no socket nor service even after switching to services.lorri.enable = true
and restarting.
Using nixos-unstable.
pepe@arrow ~> systemctl --user list-units | grep lorri
pepe@arrow ~> systemctl --user list-sockets --all
LISTEN UNIT ACTIVATES
/run/user/1000/bus dbus.socket dbus.service
/run/user/1000/pipewire-0 pipewire.socket pipewire.service
/run/user/1000/pulse/native pulseaudio.socket pulseaudio.service
We are swallowing varlink errors in the check, so I assume there is a varlink error message here that we silently ignore:
Lines 24 to 32 in 93d9301
cc @curiousleo @nyarly
We should really pay attention to not swallow any errors, it usually hides important bugs down the road.
Lemme write a patch that shows what is going on.
Please check out #441 and try reproducing the problem after running lorri self-upgrade local /path/to/lorri/checkout
@codygman, @pepeiborra. It should print a better stacktrace then and we can debug what the problem is.
lorri self-upgrade won't work for me.
I updated to the latest with niv though in this commit. It seems to me that for some reason the new symbolic link to lorri isn't being loaded into systemd. This is similar to me in how I have to restart emacs after installing packages with nix because it's a wrapper that modifes the load path. I'm not sure the way to workaround this issue here though.
Here are the results of debugging I mostly pasted from this commit:
[cody@nixos:~/hci/testdata/haskell-nix-stack-workflow]$ journalctl --user -u lorri -f
-- Logs begin at Sun 2019-11-17 17:48:35 CST. --
Jul 12 15:27:31 nixos systemd[1524]: Started lorri build daemon.
Jul 12 15:27:31 nixos systemd[17166]: lorri.service: Failed to execute command: No such file or directory
Jul 12 15:27:31 nixos systemd[17166]: lorri.service: Failed at step EXEC spawning /nix/store/94phsz8nvlmbm0gsp5hrkkmmaark4jia-source/bin/lorri: No such file or directory
Jul 12 15:27:31 nixos systemd[1524]: lorri.service: Main process exited, code=exited, status=203/EXEC
Jul 12 15:27:31 nixos systemd[1524]: lorri.service: Failed with result 'exit-code'.
Jul 12 15:27:31 nixos systemd[1524]: lorri.service: Scheduled restart job, restart counter is at 5.
Jul 12 15:27:31 nixos systemd[1524]: Stopped lorri build daemon.
Jul 12 15:27:31 nixos systemd[1524]: lorri.service: Start request repeated too quickly.
Jul 12 15:27:31 nixos systemd[1524]: lorri.service: Failed with result 'exit-code'.
Jul 12 15:27:31 nixos systemd[1524]: Failed to start lorri build daemon.
Indeed, that doesn't exist:
[cody@nixos:~/hci]$ ls /nix/store/94phsz8nvlmbm0gsp5hrkkmmaark4jia-source/bin/lorri
ls: cannot access '/nix/store/94phsz8nvlmbm0gsp5hrkkmmaark4jia-source/bin/lorri': No such file or directory
Here's the lorri unit file:
[cody@nixos:~/hci]$ systemctl --user cat lorri
# /nix/store/nm14xpfpd2pirj1v2pvpd1kwk1yi02mh-home-manager-files/.config/systemd/user/lorri.service
[Service]
Environment=PATH=/nix/store/15wpim5v22189mrfrm6n8ggg410n3rvi-nix-2.3.6/bin:/nix/store/r1jr2v6afyhc19yx9i1ck230x9987c6q-git-2.27.0/bin:/nix/store/ldh4iwq6dxsv6cj78gv060wq9xa17n1h-gnutar-1.32/bin:/nix/store/yir8n5jfmcdmf0hyk712kl97krmdshlg-gzip-1.10/bin
ExecStart=/nix/store/7988ydmqdwk1p64sjysmi06bq523nbac-lorri/bin/lorri daemon
PrivateTmp=true
ProtectHome=read-only
ProtectSystem=strict
Restart=on-failure
[Unit]
After=lorri.socket
Description=lorri build daemon
RefuseManualStart=true
Requires=lorri.socket
Hmm:
ExecStart=/nix/store/7988ydmqdwk1p64sjysmi06bq523nbac-lorri/bin/lorri daemon
Well.. that is right
[cody@nixos:~/hci]$ /nix/store/7988ydmqdwk1p64sjysmi06bq523nbac-lorri/bin/lorri --version
lorri 1.1.0
So the errors above aren't relevant... so then why isn't the socket
activating? Let's filter the logs:
[cody@nixos:~/hci]$ journalctl --user -u lorri --since='5 minutes ago'
-- Logs begin at Sun 2019-11-17 17:48:35 CST, end at Sun 2020-07-12 15:42:36 CDT. --
-- No entries --
I cd into a directory that uses lorri:
[cody@nixos:~/hci/testdata]$ cd haskell-nix-stack-workflow/
direnv: loading ~/hci/testdata/haskell-nix-stack-workflow/.envrc
Jul 12 15:45:51.351 WARN lorri daemon is not running and this project has not yet been evaluated, please run `lorri daemon`, expr: /home/cody/hci/testdata/haskell-nix-stack-workflow/shell.nix
direnv: export +IN_NIX_SHELL
Okay... direnv allow in my experince has never failed to ping lorri so
I'll try that:
[cody@nixos:~/hci/testdata/haskell-nix-stack-workflow]$ direnv allow
direnv: loading ~/hci/testdata/haskell-nix-stack-workflow/.envrc
Jul 12 15:46:17.355 WARN lorri daemon is not running and this project has not yet been evaluated, please run `lorri daemon`, expr: /home/cody/hci/testdata/haskell-nix-stack-workflow/shell.nix
direnv: export +IN_NIX_SHELL
Surely we've generated some logs by now:
[cody@nixos:~/hci/testdata/haskell-nix-stack-workflow]$ journalctl --user -u lorri --since='5 minutes ago'
-- Logs begin at Sun 2019-11-17 17:48:35 CST, end at Sun 2020-07-12 15:42:36 CDT. --
-- No entries --
Okay.. what if we do a reload as recommended in #374 (comment)
[cody@nixos:~/hci/testdata/haskell-nix-stack-workflow]$ systemctl --user daemon-reload
[cody@nixos:~/hci/testdata/haskell-nix-stack-workflow]$ journalctl --user -u lorri --since='5 minutes ago'
-- Logs begin at Sun 2019-11-17 17:48:35 CST, end at Sun 2020-07-12 15:47:23 CDT. --
-- No entries --
[cody@nixos:~/hci/testdata/haskell-nix-stack-workflow]$ direnv allow
direnv: loading ~/hci/testdata/haskell-nix-stack-workflow/.envrc
Jul 12 15:47:37.141 WARN lorri daemon is not running and this project has not yet been evaluated, please run `lorri daemon`, expr: /home/cody/hci/testdata/haskell-nix-stack-workflow/shell.nix
direnv: export +IN_NIX_SHELL
[cody@nixos:~/hci/testdata/haskell-nix-stack-workflow]$ journalctl --user -u lorri --since='5 minutes ago'
-- Logs begin at Sun 2019-11-17 17:48:35 CST, end at Sun 2020-07-12 15:47:23 CDT. --
-- No entries --
No luck.
After rebooting things worked just fine and it doesn't look like anything changed:
[cody@nixos:~/hci]$ systemctl --user cat lorri
# /nix/store/nm14xpfpd2pirj1v2pvpd1kwk1yi02mh-home-manager-files/.config/systemd/user/lorri.service
[Service]
Environment=PATH=/nix/store/15wpim5v22189mrfrm6n8ggg410n3rvi-nix-2.3.6/bin:/nix/store/r1jr2v6afyhc19yx9i1ck230x9987c6q-git-2.27.0/bin:/nix/store/ldh4iwq6dxsv6cj78gv060wq9xa17n1h-gnutar-1.32/bin:/nix/store/yir8n5jfmcdmf0hyk712kl97krmdshlg-gzip-1.10/bin
ExecStart=/nix/store/7988ydmqdwk1p64sjysmi06bq523nbac-lorri/bin/lorri daemon
PrivateTmp=true
ProtectHome=read-only
ProtectSystem=strict
Restart=on-failure
[Unit]
After=lorri.socket
Description=lorri build daemon
RefuseManualStart=true
Requires=lorri.socket
Maybe this is more of a systemd/nixos bug? Maybe NixOS/nixpkgs#49528 is related here.
In the README it's suggested to do
sudo systemctl --user daemon-reload
after adding lorri
to the system environment. But I get this when typing in the command:
Failed to connect to bus: No such file or directory
(Using nixos-unstable and lorri installed is v1.2.0 if it matters)
I did not check if the whole thing works, but the fact that there's an error message when applying recipe from README is worrying.