Leaking goroutine found because of init() dbus call
crobert-1 opened this issue · 3 comments
Context
Hello, I'm currently working on adding goleak
to tests to ensure no goroutines are leaking.
Bug
goleak
returned the following stack trace:
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 13 in state IO wait, with internal/poll.runtime_pollWait on top of the stack:
internal/poll.runtime_pollWait(0x7f3c57dcb128, 0x72)
/opt/hostedtoolcache/go/1.[20](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7507762922/job/20442043072?pr=30492#step:9:21).12/x64/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc000365a98, 0xc000641020?, 0x0)
/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_poll_runtime.go:84 +0xbd
internal/poll.(*pollDesc).waitRead(...)
/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).ReadMsg(0xc000365a80, {0xc000641020, 0x10, 0x10}, {0xc000644020, 0x1000, 0x1000}, 0xc00065c470?)
/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_unix.go:304 +0x552
net.(*netFD).readMsg(0xc000365a80, {0xc000641020, 0x10, 0x10}, {0xc000644020, 0x1000, 0x1000}, 0x5?)
/opt/hostedtoolcache/go/1.20.12/x64/src/net/fd_posix.go:78 +0xb4
net.(*UnixConn).readMsg(0xc0007a6c18, {0xc000641020, 0x10, 0x10}, {0xc000644020, 0x1000, 0x1000})
/opt/hostedtoolcache/go/1.20.12/x64/src/net/unixsock_posix.go:115 +0xc5
net.(*UnixConn).ReadMsgUnix(0xc0007a6c18, {0xc000641020, 0x10, 0x10}, {0xc000644020, 0x1000, 0x1000})
/opt/hostedtoolcache/go/1.20.12/x64/src/net/unixsock.go:143 +0xc5
github.com/godbus/dbus.(*oobReader).Read(0xc000644000, {0xc000641020, 0x10, 0x10})
/home/runner/go/pkg/mod/github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc300e2/transport_unix.go:[21](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7507762922/job/20442043072?pr=30492#step:9:22) +0xa8
io.ReadAtLeast({0xe0[25](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7507762922/job/20442043072?pr=30492#step:9:26)b00, 0xc000644000}, {0xc000641020, 0x10, 0x10}, 0x10)
/opt/hostedtoolcache/go/1.20.12/x64/src/io/io.go:332 +0xde
io.ReadFull(...)
/opt/hostedtoolcache/go/1.20.12/x64/src/io/io.go:351
github.com/godbus/dbus.(*unixTransport).ReadMessage(0xc00079f5f0)
/home/runner/go/pkg/mod/github.com/godbus/dbus@v0.0.0-201907[26](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7507762922/job/20442043072?pr=30492#step:9:27)142602-4481cbc300e2/transport_unix.go:91 +0x2c6
github.com/godbus/dbus.(*Conn).inWorker(0xc0007fe6c0)
/home/runner/go/pkg/mod/github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc300e2/conn.go:[29](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7507762922/job/20442043072?pr=30492#step:9:30)4 +0x6b
created by github.com/godbus/dbus.(*Conn).Auth
/home/runner/go/pkg/mod/github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc[30](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/7507762922/job/20442043072?pr=30492#step:9:31)0e2/auth.go:118 +0xcd2
]
After investigation I found that this is caused by the following line:
Line 24 in 929f178
dbus.SessionBus()
creates a goroutine in the background when called. Since this is called within init
, this goroutine is started whether or not the 99designs/keyring repository is used.
Solution
It would be best to only call the aforementioned method when required, outside of init. Also, a public API for closing the session should be available for consumers on shutdown.
Hey @crobert-1, looks like this is related to #103, there's a bunch of linked issues that make the same conclusion as you but there doesn't appear to be an intent to fix this from the maintainers (or any communication at all). It's extremely unfortunate that this bug is within an init function as it means any dependency that pulls this package in will introduce this leak, and it's everywhere.
We've had this package sneak into our builds from two different dependencies over the years, more context on our own investigation and (kind of) fix: redpanda-data/connect#1184. I ended up using a replace directive with a modified version of the keyring package. This only works because we're not actually using this package for anything, and it means anyone using your own package as a library will need to also add the replace directive unless they're okay with the leaks.
Very yucky.