microsoft/WSL

dbus doesn't seem to work

tycho opened this issue · 26 comments

tycho commented

Well, first I can't start it normally because PID 1 isn't really upstart, and nothing's listening on the upstart socket:

# service dbus start
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused

But even if I start the D-Bus daemon manually, like this:

# mkdir -p /run/dbus
# chown messagebus:messagebus /run/dbus
# dbus-uuidgen --ensure
# dbus-daemon --system --fork
# ls -l /run/dbus/
ls: /run/dbus/system_bus_socket: Invalid argument
total 0
-rw-r--r-- 1 root root 5 May 15 13:12 pid
srwxrwxrwx 1 root root 0 May 15 13:12 system_bus_socket

It doesn't respond as I'd expect it to when I try to interact with the message bus:

# dbus-monitor --system --monitor
Failed to open connection to system bus: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

It looks like a sendmsg breaks somehow (below output wrapped at 80 columns for readability, because GitHub just adds a scrollbar otherwise, and makes the return value annoying to find):

# strace -s 65535 -f dbus-monitor --system --monitor | fold -w 80
strace: Test for PTRACE_O_TRACESYSGOOD failed, giving up using this feature.
execve("/usr/bin/dbus-monitor", ["dbus-monitor", "--system", "--monitor"], [/* 7
0 vars */]) = 0
[ ... trimmed out shared library loads and program initialization ... ]
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/dbus/system_bus_socket"}, 33)
 = 0
fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
geteuid()                               = 0
getsockname(3, {sa_family=AF_LOCAL, NULL}, [2]) = 0
poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "\0", 1, MSG_NOSIGNAL, NULL, 0) = 1
sendto(3, "AUTH EXTERNAL 30\r\n", 18, MSG_NOSIGNAL, NULL, 0) = 18
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
read(3, "OK 99eed480d5eda4f5f283f0425738d840\r\n", 2048) = 37
poll([{fd=3, events=POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "NEGOTIATE_UNIX_FD\r\n", 19, MSG_NOSIGNAL, NULL, 0) = 19
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
read(3, "AGREE_UNIX_FD\r\n", 2048)      = 15
poll([{fd=3, events=POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "BEGIN\r\n", 7, MSG_NOSIGNAL, NULL, 0) = 7
poll([{fd=3, events=POLLIN|POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOU
T}])
sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"l\1\0\1\0\0\0\0\1\0\0\0n\0\0\0\1\1o\
0\25\0\0\0/org/freedesktop/DBus\0\0\0\6\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\
0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\3\1s\0\5\0\0\0Hello\0\0\0", 128},
{"", 0}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = -1 EINVAL (Invalid arg
ument)
close(3)                                = 0
clock_gettime(CLOCK_MONOTONIC, {2738, 696180000}) = 0
write(2, "Failed to open connection to system bus: Did not receive a reply. Poss
ible causes include: the remote application did not send a reply, the message bu
s security policy blocked the reply, the reply timeout expired, or the network c
onnection was broken.\n", 252Failed to open connection to system bus: Did not re
ceive a reply. Possible causes include: the remote application did not send a re
ply, the message bus security policy blocked the reply, the reply timeout expire
d, or the network connection was broken.
) = 252
exit_group(1)                           = ?
<... exit_group resumed> _exit returned!
)              = ? ([{fd=3, revents=POLLOUT}])
+++ exited with 1 +++

Thanks for reporting; dbus is on our feature backlog. I took a look at your repro and it looks like there's a fair amount of missing surface area in our socket implementation to get dmsg working correctly. I've added details to our internal bug.

@tycho - Apart from what @benhillis has mentioned, the specific problem here is that our UNIX socket sendmsg implementation does not handle an IO vector length of > 1. Thanks for the trace; traces are almost always useful.
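In case it helps anyone check a given build for this, here is a minimal sketch in Python. It is not the repro the WSL team used, and the function name `send_two_part_message` is made up for illustration. It only assumes that a `sendmsg` call with two buffers is enough to exercise the same code path as the failing call in the strace output (`msg_iov(2)`).

```python
import socket


def send_two_part_message(first: bytes, second: bytes) -> bytes:
    """Send two buffers with one sendmsg() call and return what arrives."""
    # socketpair() gives two connected AF_UNIX stream sockets,
    # so no bus daemon is needed for the check.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # A list of two buffers becomes an IO vector of length 2,
        # the same shape as the call that returned EINVAL in the trace.
        sent = sender.sendmsg([first, second])

        # Read back exactly as many bytes as the kernel accepted.
        received = b""
        while len(received) < sent:
            chunk = receiver.recv(sent - len(received))
            if not chunk:
                break
            received += chunk
        return received
    finally:
        sender.close()
        receiver.close()
```

On a kernel that handles multi-element IO vectors, `send_two_part_message(b"header", b"body")` returns the two buffers concatenated. On an affected build I would expect the `sendmsg` call to raise `OSError` with `EINVAL` instead, though I have not verified that on such a build.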

For reference / in case it saves anyone else time: dbus can be configured to use TCP instead of UNIX domain sockets. Unfortunately (though unsurprisingly), doing so doesn't help; strace indicates that it hits the same issue with sendmsg(..., msg_iov(2)=...).

Yes, INET sockets also lack this capability as you have discovered :). I have opened a bug to track the support for IO vector length > 1 for sendmsg/recvmsg.

Paths are incorrect in /etc/init.d/dbus.

DAEMON=/usr/bin/dbus-daemon
UUIDGEN=/usr/bin/dbus-uuidgen

Actually,

$ which dbus-daemon
/bin/dbus-daemon
$ which dbus-uuidgen
/bin/dbus-uuidgen

Has something changed on this issue? After reading this Reddit thread I tried switching dbus configuration to use tcp sockets instead of unix sockets, but unfortunately despite no apparent network errors I still ran into some issues:

root@DESKTOP-819ATN1:~# systemctl restart dbus
Failed to connect to bus: No such file or directory
root@DESKTOP-819ATN1:~#

Here is the full strace: http://sprunge.us/NYYM. Apparently, dbus still looked for a file named /var/run/dbus/system_bus_socket.

With better support for AF_LOCAL sockets, dbus seems to be happy, and the dbus workarounds that use TCP sockets will hopefully no longer be needed. The changes haven't made it to the release branch yet, but are moving fast. Keep an eye on the release notes.

wench commented

DBUS is triggering the socket-related BSOD in 14936. Makes it completely unusable.

@wench The fix for the BSOD is coming soon. Apologies for the inconvenience.

wench commented

Maybe I should prod/beg Dona and get her to release a build that has it fixed.

Thanks for reporting. Closing, since this issue should have been resolved in an insider build late last year or early this year.

Not sure how related this is, but I've just done a fresh install of WSL and then installed a server environment from a script on the new Xenial WSL. The installation of everything seemed to go fine, but I did get this at the end:

Failed to connect to bus: No such file or directory.

@Benqzq - Thanks for the report. Can you try /etc/init.d/dbus start and then rerun your repro? Also, which Windows build are you running? You can dump the output of the ver command from a Windows command prompt.

My build is 15063.138. By "rerun your repro", do you mean I should fully uninstall my server environment, execute the code, and then reinstall the server environment?

@Benqzq - No, no! Just rerun whatever is failing, if there is a specific scenario that fails for you.

Oh, I can't tell; this is a script of about 90 lines. I don't know which part triggers this error, as it just appears at the end of the long output (sorry if it sounds weird, I am quite new to Linux).

Ok. If you see that error happen again for a specific scenario, you can try the steps above.

@sunilmut ... I got this again after executing:

		cat <<-'LAMPENV' >> /etc/apache2/apache2.conf
                ...
		LAMPENV
		systemctl restart apache2.service

Note that I mistakenly executed systemctl restart apache2.service together with the heredoc.

I then ran /etc/init.d/dbus start (without the last systemctl restart apache2.service) and tried again and this time it worked fine.

If I understand correctly, the current release of WSL doesn't start D-Bus automatically.

@Benqzq - Glad to know that it's working for you now. Your understanding is correct. Currently, WSL doesn't "boot" per se, i.e. it doesn't start the default services that you would find in Linux, such as dbus. This is also closely tied to the lack of support for daemons. You can see the corresponding User Voice page here. We are looking into some of these bigger problems, but we don't have an ETA at the moment.
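Since nothing starts dbus automatically, a script can check for the system bus socket before calling anything that needs it. Below is a rough sketch in Python, not an official WSL mechanism. The helper names are hypothetical, the socket path is the one seen in the strace output earlier in this thread, and the start command is the `/etc/init.d/dbus start` suggested above (which needs root).

```python
import os
import stat
import subprocess

# Path the client connected to in the strace output in this thread.
SYSTEM_BUS_SOCKET = "/var/run/dbus/system_bus_socket"


def bus_socket_present(path: str = SYSTEM_BUS_SOCKET) -> bool:
    """Return True if `path` exists and is a UNIX domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file, missing directory, or no permission to stat it.
        return False


def ensure_dbus_started(path: str = SYSTEM_BUS_SOCKET) -> bool:
    """Start dbus via its init script if the socket is missing (needs root)."""
    if bus_socket_present(path):
        return True
    try:
        # Same command as suggested in this thread; failures are not fatal here.
        subprocess.run(["/etc/init.d/dbus", "start"], check=False)
    except OSError:
        # The init script does not exist or cannot be executed.
        return False
    return bus_socket_present(path)
```

Calling `ensure_dbus_started()` at the top of a provisioning script is one way to avoid "Failed to connect to bus: No such file or directory" when the only problem is that the daemon was never started. It does not help with tools that need systemd itself.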

Just installed WSL on my Windows 10 machine.
Output of lsb_release -a:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:        16.04
Codename:       xenial

I was following the Docker tutorial, and it fails on the last step:
sudo systemctl status docker errors with

Failed to connect to bus: No such file or directory

As a result, I can't run docker-compose build, which says

ERROR: Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?

Running docker run hello-world errors with

docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

UPDATE: Setting the “Expose daemon on tcp://localhost:2375 without TLS” did help
so all good now.

I also set up an alias to the Docker host:

$ echo "export DOCKER_HOST='tcp://0.0.0.0:2375'" >> ~/.bashrc
$ source ~/.bashrc
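To confirm that the exposed TCP endpoint is actually accepting connections before blaming the Docker client, something like the following Python sketch could be used. It is only a reachability check with a hypothetical helper name, `docker_host_reachable`. It does not speak the Docker API, and it assumes `DOCKER_HOST` has the `tcp://host:port` form used above.

```python
import socket
from urllib.parse import urlparse


def docker_host_reachable(docker_host: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to a tcp://host:port endpoint succeeds."""
    parsed = urlparse(docker_host)
    try:
        port = parsed.port
    except ValueError:
        # The port part was present but not a valid number.
        return False
    # Only tcp:// endpoints with an explicit host and port are handled here.
    if parsed.scheme != "tcp" or parsed.hostname is None or port is None:
        return False
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False
```

For example, `docker_host_reachable(os.environ.get("DOCKER_HOST", ""))` (after `import os`) returns False when the variable is unset or when nothing is listening on the port.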

Fresh install of OpenSuse (42.3) WSL. Tons of error messages about dbus.

# systemctl restart dbus
Failed to connect to bus: No such file or directory
( 58/102) Installing: logrotate-3.11.0-15.2.x86_64 .....................................................................................................................................................................................[done]
Additional rpm output:
Failed to connect to bus: No such file or directory

etc.

systemctl restart dbus

No systemctl on WSL. Landing zone is #994. Unrelated to dbus.

@therealkenc The question is what does rpm run when it has extra jobs to do after the installation. Is it also systemctl or something else?

@archon810 - missed your question. I don't use rpm based distros (much) but I gather at least SUSE 42 uses systemctl at least sometimes per #2941. rpm doesn't "run" systemctl per se. It depends on how the package maintainers decide to implement their postinstall scripts, based on whatever their platform (Ubuntu, SUSE, Fedora) prefers. For some definition of prefer. The meaning of "package maintainers" here is kind of fluid. Some upstream packages are fairly wed to systemd (with dbus and the rest). Others, less so. The distribution package maintainers try to bring the differences into some kind of semi-coherency.

the following solution worked for me:

https://x410.dev/cookbook/wsl/sharing-dbus-among-wsl2-consoles/

the only downside is having to put my password in a script.

systemd works in WSL2 from the Windows Store. You do not need these kinds of workarounds anymore. See this article for more details: https://devblogs.microsoft.com/commandline/systemd-support-is-now-available-in-wsl/