prometheus/client_golang

A potential goroutine memory leak

xuxiaofan1203 opened this issue · 2 comments

Hello, I found a potential bug while using this project. I'm not sure I'm right, but it seems worth discussing to avoid possible trouble.

Blocking position:


At line 449, wg.Add(goroutineBudget) is called, so wg.Wait() blocks until wg.Done() has been called goroutineBudget times.
But if the select statement takes the default path, the worker function returns. If wg.Done() has not been called enough times by then, wg.Wait() is never woken up, which can result in a goroutine leak:
    collectWorker := func() {
        for {
            select {
            case collector := <-checkedCollectors:
                collector.Collect(checkedMetricChan)
            case collector := <-uncheckedCollectors:
                collector.Collect(uncheckedMetricChan)
            default:
                return
            }
            wg.Done()
        }
    }

Complete code for this part:
    wg.Add(goroutineBudget)
    collectWorker := func() {
        for {
            select {
            case collector := <-checkedCollectors:
                collector.Collect(checkedMetricChan)
            case collector := <-uncheckedCollectors:
                collector.Collect(uncheckedMetricChan)
            default:
                return
            }
            wg.Done()
        }
    }
    // Start the first worker now to make sure at least one is running.
    go collectWorker()
    goroutineBudget--
    // Close checkedMetricChan and uncheckedMetricChan once all collectors
    // are collected.
    go func() {
        wg.Wait()
        close(checkedMetricChan)
        close(uncheckedMetricChan)
    }()

I guess we could test this behavior using the runtime package? More precisely, using runtime.NumGoroutine() (https://pkg.go.dev/runtime#NumGoroutine).

Check the number of goroutines, execute the function you believe leaks, then check the number of goroutines again. If the second number is higher, it means we have a leak :)

As @kakkoyun suggested, we can double-check with https://github.com/uber-go/goleak. Help wanted to add this (if it hasn't been added already) (: