runtime: SIGSEGV in mapassign_fast64 during cmd/vet
myitcv opened this issue · 12 comments
What version of Go are you using (go version)?
$ go version
go version devel +0ac8739ad5 Mon Nov 18 15:11:03 2019 +0000 linux/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/myitcv/.cache/go-build"
GOENV="/home/myitcv/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/myitcv/gostuff"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/myitcv/gos"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/myitcv/gos/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/myitcv/gostuff/src/github.com/myitcv/govim/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build967409403=/tmp/go-build -gno-record-gcc-switches"
What did you do?
I just got a random failure running tests on govim:
$ go test -short -count=1 ./...
unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x410fff]
goroutine 1 [running]:
runtime.throw(0x760145, 0x5)
/home/myitcv/dev/go/src/runtime/panic.go:1106 +0x72 fp=0xc000150c18 sp=0xc000150be8 pc=0x4316b2
runtime.sigpanic()
/home/myitcv/dev/go/src/runtime/signal_unix.go:674 +0x3cc fp=0xc000150c48 sp=0xc000150c18 pc=0x446b9c
runtime.mapassign_fast64(0x705ee0, 0x637469796d2f656d, 0x0, 0x7f256f939698)
/home/myitcv/dev/go/src/runtime/map_fast64.go:100 +0x2f fp=0xc000150c88 sp=0xc000150c48 pc=0x410fff
go/internal/gcimporter.iImportData(0xc0000fa900, 0xc0002e7e30, 0xc000480001, 0x6dd16, 0x7fdff, 0xc000016ddd, 0x3, 0x0, 0x0, 0x0, ...)
/home/myitcv/dev/go/src/go/internal/gcimporter/iimport.go:112 +0x4a8 fp=0xc000150fb0 sp=0xc000150c88 pc=0x65cb38
go/internal/gcimporter.Import(0xc0000fa900, 0xc0002e7e30, 0xc000016ddd, 0x3, 0x0, 0x0, 0xc0002f0610, 0x0, 0x0, 0x0)
/home/myitcv/dev/go/src/go/internal/gcimporter/gcimporter.go:159 +0x4ab fp=0xc0001511a8 sp=0xc000150fb0 pc=0x65bedb
go/importer.(*gcimports).ImportFrom(0xc0002f2720, 0xc000016ddd, 0x3, 0x0, 0x0, 0x0, 0xc00009b8a8, 0x72ff60, 0xc00009b8a0)
/home/myitcv/dev/go/src/go/importer/importer.go:102 +0x7c fp=0xc000151208 sp=0xc0001511a8 pc=0x6645ac
go/importer.(*gcimports).Import(0xc0002f2720, 0xc000016ddd, 0x3, 0x3, 0xc00009b928, 0x70cc01)
/home/myitcv/dev/go/src/go/importer/importer.go:95 +0x50 fp=0xc000151260 sp=0xc000151208 pc=0x6644f0
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func2(0xc000016ef1, 0x3, 0x70c680, 0xc000091e01, 0x0)
/home/myitcv/dev/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:221 +0x95 fp=0xc0001512c8 sp=0xc000151260 pc=0x6773d5
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.importerFunc.Import(0xc0002f2740, 0xc000016ef1, 0x3, 0x0, 0x0, 0x768b00)
/home/myitcv/dev/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:396 +0x3a fp=0xc000151300 sp=0xc0001512c8 pc=0x676f4a
go/types.(*Checker).importPackage(0xc000091440, 0x28, 0xc000016ef1, 0x3, 0xc00001e1c0, 0x63, 0x0)
/home/myitcv/dev/go/src/go/types/resolver.go:158 +0x602 fp=0xc0001513e8 sp=0xc000151300 pc=0x625242
go/types.(*Checker).collectObjects(0xc000091440)
/home/myitcv/dev/go/src/go/types/resolver.go:253 +0x8c6 fp=0xc000151988 sp=0xc0001513e8 pc=0x625c96
go/types.(*Checker).checkFiles(0xc000091440, 0xc0002c6180, 0x9, 0x10, 0x0, 0x0)
/home/myitcv/dev/go/src/go/types/check.go:253 +0x95 fp=0xc0001519d8 sp=0xc000151988 pc=0x608955
go/types.(*Checker).Files(...)
/home/myitcv/dev/go/src/go/types/check.go:246
go/types.(*Config).Check(0xc0002ef000, 0xc0000240f0, 0x49, 0xc0000fa900, 0xc0002c6180, 0x9, 0x10, 0xc0002dfe00, 0x0, 0x16, ...)
/home/myitcv/dev/go/src/go/types/api.go:348 +0x134 fp=0xc000151a48 sp=0xc0001519d8 pc=0x5fd654
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run(0xc0000fa900, 0xc000094dc0, 0xc0000fa7c0, 0x6, 0x8, 0x2e746c7573, 0x7de5c0, 0x707de0, 0xc0000169f8, 0xc0000a5c10)
/home/myitcv/dev/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:235 +0x404 fp=0xc000151bf0 sp=0xc000151a48 pc=0x676514
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.Run(0x7ffc676aac5f, 0x23, 0xc0000fa7c0, 0x6, 0x8)
/home/myitcv/dev/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:131 +0x113 fp=0xc000151eb8 sp=0xc000151bf0 pc=0x675993
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.Main(0xc0000fa7c0, 0x6, 0x8)
/home/myitcv/dev/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:118 +0x25f fp=0xc000151f40 sp=0xc000151eb8 pc=0x6756af
main.main()
/home/myitcv/dev/go/src/cmd/vet/main.go:35 +0x2bd fp=0xc000151f88 sp=0xc000151f40 pc=0x6bf7ad
runtime.main()
/home/myitcv/dev/go/src/runtime/proc.go:203 +0x212 fp=0xc000151fe0 sp=0xc000151f88 pc=0x433b72
runtime.goexit()
/home/myitcv/dev/go/src/runtime/asm_amd64.s:1375 +0x1 fp=0xc000151fe8 sp=0xc000151fe0 pc=0x45f5f1
What did you expect to see?
No panic
What did you see instead?
Panic
This is coming from the cmd/vet subprocess.
The failing line is here:
go/src/go/internal/gcimporter/iimport.go
Line 112 in 95b8cbf
That map is a field on an unshared, local struct of type iimporter, and is initialized unconditionally just above:
go/src/go/internal/gcimporter/iimport.go
Line 103 in 95b8cbf
This looks like a compiler or runtime bug to me.
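For readers without the source at hand, the pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from iimport.go: the type `importerState`, its `cache` field, and the `fill` helper are hypothetical names chosen for the example. It shows a local struct whose map field is created with `make` in the struct literal and then written with `uint64` keys, which is the kind of assignment that shows up as runtime.mapassign_fast64 in the trace.

```go
package main

import "fmt"

// importerState is a hypothetical stand-in for the importer struct discussed
// above; the field name and types are illustrative, not copied from iimport.go.
type importerState struct {
	cache map[uint64]string
}

// fill builds a local, unshared struct, initializes its map unconditionally
// in the struct literal, and then assigns into it with uint64 keys.
func fill(names []string) map[uint64]string {
	p := importerState{
		cache: make(map[uint64]string),
	}
	for i, name := range names {
		p.cache[uint64(i)] = name
	}
	return p.cache
}

func main() {
	m := fill([]string{"bool", "int", "string"})
	fmt.Println(len(m), m[1]) // prints "3 int"
}
```

In a correctly functioning toolchain there is no path on which the map in such code is nil or invalid at the assignment, which is why the fault points away from the importer itself.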
Marking as release-blocker to at least triage before 1.14. (If we understand the root cause better, we can reprioritize as appropriate.)
I compiled cmd/vet at the indicated commit (0ac8739) and got the exact same binary, as far as I can tell. The assembly, with the faulting instruction marked, is here:
0x0000000000410ff1 <+33>: mov 0x48(%rsp),%rax
0x0000000000410ff6 <+38>: test %rax,%rax
0x0000000000410ff9 <+41>: je 0x4112f3 <runtime.mapassign_fast64+803>
0x0000000000410fff <+47>: movzbl 0x8(%rax),%ecx <====================SEGV
So we are loading the argument h (type hmap) into %rax and jumping to a panic if it is nil/zero (the first three assembly instructions). But then, when we do a load through %rax, we get a SEGV, and the fault address is reported as 0 ("unexpected fault address 0x0").
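To illustrate why the nil check matters here, the following sketch shows what an assignment to a genuinely nil map looks like from Go code: an ordinary, recoverable runtime panic rather than a SIGSEGV. The helper name `assignToNilMap` is made up for this example.

```go
package main

import "fmt"

// assignToNilMap writes to a nil map and returns the recovered panic message.
func assignToNilMap() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	var m map[uint64]string // nil: never initialized with make
	m[1] = "x"              // the runtime checks for nil before touching the map header
	return "no panic"
}

func main() {
	fmt.Println(assignToNilMap()) // prints "assignment to entry in nil map"
}
```

So a zero h at the `test`/`je` pair would have produced that panic; a fault on the following load suggests the value in %rax was non-zero but not a valid address.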
@aclements Any chance that preemption is happening between these instructions, and not restoring the %rax register (i.e. changing it from non-zero to zero)? Just a thought, since I thought that there was still no pre-emption in runtime code.
Otherwise, this is pretty mysterious, since the map was just initialized above, as pointed out by @bcmills
One other thing to note is that the h arg of runtime.mapassign_fast64 in the stacktrace looks like a bogus pointer (I think) -- 0x637469796d2f656d. But I'm not sure these stack args are always right during a panic, etc. But it is definitely not zero.
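One quick way to sanity-check a suspicious value like this is to look at its bytes in memory order. The sketch below (the helper name `decodeLE` is hypothetical) lays the value out as little-endian, as amd64 stores it, and prints the result as text. For this value every byte is printable ASCII, which is consistent with it not being a pointer, though that alone does not say where the corruption came from.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeLE returns the 8 bytes of v in little-endian (amd64 memory) order
// as a string, so that text-like values become readable.
func decodeLE(v uint64) string {
	var b [8]byte
	binary.LittleEndian.PutUint64(b[:], v)
	return string(b[:])
}

func main() {
	// The h argument shown for runtime.mapassign_fast64 in the stack trace.
	fmt.Printf("%q\n", decodeLE(0x637469796d2f656d)) // prints "me/myitc"
}
```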
Just to point out one thing that I have observed as an uninformed reporter of these bugs: I only ever see these problems immediately after (~1 sec) starting a command, e.g. go test. That is to say, if I don't see an error right at the start (like any of those that I have reported), I won't see one at all.
@myitcv, what kernel version and distro are you running?
@danscales, failing to restore rax seems really unlikely. I think this is a cascade from some earlier corruption, though my hunch is that the earlier corruption has to do with some register corruption.
@aclements - you have the correct details for me in #35326 (comment).
If it's relevant, I'm running this in a VMWare Fusion virtual host atop macOS 10.12.6. I can of course provide any more details you need.
Ah, thanks. I'm losing track of who's reported what. I'll add this to the super-bug (#35777).
I suspect the VMWare isn't relevant, but that's good to know.
Any chance you're able to reproduce this?
Unfortunately not. All of the instances I have reported have been totally random and unreproducible.
The other "feature" I observed was noted in #35689 (comment), i.e. I have only seen issues during the compile step of running go test. Once everything is compiled and running, I have observed no other runtime issues (although I will say the tests in question aren't exactly heavily stressing the Go runtime).
Just as a follow-up: setting GOCACHE=$(mktemp -d) does allow me to relatively reliably reproduce a variant of the version skew issue, but I haven't been able to reproduce one of these "others".
Thanks. Since this particular failure doesn't seem to be reproducible, closing in favor of the super-bug.