golang/go

x/tools/gopls: modernize breaks benchmarks when b.N is used to prepare testdata

Closed this issue · 4 comments

bep commented

Go version

go version go1.25.3 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='clang'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='clang++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/bep/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/bep/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/_g/j3j21hts4fn7__h04w2x8gb40000gn/T/go-build2010589526=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/bep/dev/go/bep/debounce/go.mod'
GOMODCACHE='/Users/bep/dev/gomod_cache'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/bep/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='on'
GOTELEMETRYDIR='/Users/bep/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.25.3'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

Ran modernize -fix ./... on a repo with the benchmark below in it (made up example):

func BenchmarkSort(b *testing.B) {
	slices := make([][]int, b.N)
	for i := 0; i < b.N; i++ {
		slices[i] = []int{5, 3, 4, 1, 2}
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		slice := slices[i]
		// Simple bubble sort for demonstration purposes
		for j := 0; j < len(slice); j++ {
			for k := 0; k < len(slice)-j-1; k++ {
				if slice[k] > slice[k+1] {
					slice[k], slice[k+1] = slice[k+1], slice[k]
				}
			}
		}
	}
}

What did you see happen?

func BenchmarkSort(b *testing.B) {
	slices := make([][]int, b.N)
	for i := 0; b.Loop(); i++ {
		slices[i] = []int{5, 3, 4, 1, 2}
	}

	for i := 0; b.Loop(); i++ {
		slice := slices[i]
		// Simple bubble sort for demonstration purposes
		for j := range slice {
			for k := 0; k < len(slice)-j-1; k++ {
				if slice[k] > slice[k+1] {
					slice[k], slice[k+1] = slice[k+1], slice[k]
				}
			}
		}
	}
}

Which when run, fails:

BenchmarkSort-10        panic: runtime error: index out of range [1] with length 1

goroutine 6 [running]:
github.com/bep/debounce_test.BenchmarkSort(0x140000ea588)
        /Users/bep/dev/go/bep/debounce/debounce_test.go:163 +0x228
testing.(*B).runN(0x140000ea588, 0x1)
        /usr/local/go/src/testing/benchmark.go:219 +0x184
testing.(*B).run1.func1()
        /usr/local/go/src/testing/benchmark.go:245 +0x4c
created by testing.(*B).run1 in goroutine 1
        /usr/local/go/src/testing/benchmark.go:238 +0x88
exit status 2

What did you expect to see?

No error.

This benchmark was invalid even before the transformation. The parameter N should be used only to determine the number of repeated calls of the measured operation. The testing package assumes the the operation is logically the same each time so that it can extrapolate and interpolate according to a linear model relating execution time to N. However, your benchmark also uses N to vary the size of the input to the measured operation (which is quadratic in, not independent of, N), breaking this linear relationship. So the results of the benchmark are meaningless.

I've made the same mistake myself. We should have a vet check that detects it. However, this is not a bug in modernize.

bep commented

However, your benchmark also uses N to vary the size of the input to the measured operation (which is quadratic in, not independent of, N), breaking this linear relationship. So the results of the benchmark are meaningless.

Just to be clear: The N is not used to vary the size of the input to sort in the above (the length of the slice is fixed at 5 elments). The benchmark is also just a dummy benchmark created by GitHub Copilot for this purpose (and since AI autocompletes the pattern above shows that this is not uncommon.

I do agree that using b.N like the above is a antipattern, though, and I would welcome a vet validation.

Just to be clear: The N is not used to vary the size of the input to sort in the above (the length of the slice is fixed at 5 elments). The benchmark is also just a dummy benchmark created by GitHub Copilot for this purpose (and since AI autocompletes the pattern above shows that this is not uncommon.
I do agree that using b.N like the above is a antipattern, though, and I would welcome a vet validation.

You're quite right; apologies for reading in such haste. Now that I have understood your example, I realize that the analyzer I just proposed in #76201 would report it as a false positive.

I think the bug you are reporting is effectively a duplicate of #75599, which is now fixed at master. Try cloning x/tools, cd gopls && go install, to pick up the fix. (You can't just go install ...gopls@latest)