prometheus/client_golang

Index out of range error in native histogram

krajorama opened this issue · 2 comments

Version github.com/prometheus/client_golang v1.20.0

When testing a trivial solution for #1605:

 go test -count 1 -race -timeout 300s -run ^TestNativeHistogramConcurrency$ github.com/prometheus/client_golang/prometheus
panic: runtime error: index out of range [10] with length 10

goroutine 10 [running]:
github.com/prometheus/client_golang/prometheus.(*nativeExemplars).addExemplar(0xc0002307c8, 0xc00055f4f0)
	/home/krajo/go/github.com/krajorama/client_golang/prometheus/histogram.go:1791 +0x1969
github.com/prometheus/client_golang/prometheus.(*histogram).updateExemplar(0xc000230700, 0x3ff22643022060a2, 0x0, 0xc00054fcb0)
	/home/krajo/go/github.com/krajorama/client_golang/prometheus/histogram.go:1140 +0x117
github.com/prometheus/client_golang/prometheus.(*histogram).ObserveWithExemplar(0xb64180?, 0xc00054fcb0?, 0xbe246f?)
	/home/krajo/go/github.com/krajorama/client_golang/prometheus/histogram.go:770 +0x6a
github.com/prometheus/client_golang/prometheus.TestNativeHistogramConcurrency.func1.2({0xc0005d0000, 0x46e4, 0x420820?})
	/home/krajo/go/github.com/krajorama/client_golang/prometheus/histogram_test.go:1058 +0x17b
created by github.com/prometheus/client_golang/prometheus.TestNativeHistogramConcurrency.func1 in goroutine 21
	/home/krajo/go/github.com/krajorama/client_golang/prometheus/histogram_test.go:1050 +0x325
FAIL	github.com/prometheus/client_golang/prometheus	2.189s

I think this is probably because we have:

nIdx = len(n.exemplars)

This tells the code to insert after the end of the slice. Usually this value would not end up in rIdx; however, in a concurrent environment the timestamp of the last inserted exemplar might be newer than that of the exemplar currently being inserted, so we don't go into this branch:

if n.exemplars[nIdx-1].Timestamp.AsTime().Before(e.Timestamp.AsTime()) {

I found related issues that might be helpful:

  • #1605
    This issue was referenced by the current issue you're encountering and is still open. It discusses a data race in the native histogram, which might be related to the index out of range error you're seeing.

  • #1607
    This issue does not have any comments but is directly related to the index out of range error in the native histogram. It might provide additional context or be a place to track updates related to your problem.


fixed