goref panics when attaching to a process with about 500 MB of memory usage
oilbeater opened this issue · 15 comments
Describe the bug
goref panics when I attach to k3s to inspect its memory usage.
To Reproduce
Steps to reproduce the behavior:
- Build k3s in DEBUG mode to enable DWARF
- Run k3s server
- Run grf attach ${pid}
It panics with this output:
2024-07-23T05:57:20Z error layer=debugger could not resolve parametric type of s: parametric type without a dictionary
2024-07-23T05:57:20Z error layer=debugger could not resolve parametric type of s: parametric type without a dictionary
panic: runtime error: makeslice: len out of range
goroutine 1 [running]:
github.com/cloudwego/goref/pkg/proc.cacheMemory({0xa7d078, 0xc0001a4e70}, 0x0, 0xf876c00000000)
/root/workspace/github.com/goref/pkg/proc/mem.go:79 +0x145
github.com/cloudwego/goref/pkg/proc.(*HeapScope).readType(0xc01cd763c0, 0xc00c3469c0, 0xc003e64734, 0xc003d79c88, 0xc003d7a200)
/root/workspace/github.com/goref/pkg/proc/heap.go:330 +0x173
github.com/cloudwego/goref/pkg/proc.(*HeapScope).copyGCMask(0xc01cd763c0, 0xc00c3469c0, 0xc003d79c80)
/root/workspace/github.com/goref/pkg/proc/heap.go:302 +0x96
github.com/cloudwego/goref/pkg/proc.(*ObjRefScope).findObject(0xc054c1ba78, 0xc003d79f28, {0xa81d00, 0xc00046e080}, {0xa7cdf8, 0xc01b581600})
/root/workspace/github.com/goref/pkg/proc/objects.go:69 +0xfd
github.com/cloudwego/goref/pkg/proc.(*ObjRefScope).findRef(0xc054c1ba78, 0xc0280866e0, 0x0)
/root/workspace/github.com/goref/pkg/proc/objects.go:168 +0xe77
github.com/cloudwego/goref/pkg/proc.ObjectReference(0xc00012e0f0, {0x9c2328, 0x7})
/root/workspace/github.com/goref/pkg/proc/objects.go:449 +0xc35
github.com/cloudwego/goref/cmd/grf/cmds.execute(0xd2e2, {0x0, 0x0}, {0x0, 0x0}, {0x9c2328, 0x7}, 0xc00018e500)
/root/workspace/github.com/goref/cmd/grf/cmds/commands.go:139 +0x2ac
github.com/cloudwego/goref/cmd/grf/cmds.attachCmd(0xc0001ba100?, {0xc00003ebc0?, 0x1?, 0x9beeb9?})
/root/workspace/github.com/goref/cmd/grf/cmds/commands.go:108 +0xf9
github.com/spf13/cobra.(*Command).execute(0xc000027208, {0xc00003eb70, 0x1, 0x1})
/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:987 +0xab1
github.com/spf13/cobra.(*Command).ExecuteC(0xc000026f08)
/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(0xc0000061c0?)
/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039 +0x13
main.main()
/root/workspace/github.com/goref/cmd/grf/main.go:22 +0x1a
I manually disabled the cache by setting cacheEnabled in the source code to false, and then got this panic:
2024-07-23T06:12:05Z error layer=debugger could not resolve parametric type of s: parametric type without a dictionary
2024-07-23T06:12:05Z error layer=debugger could not resolve parametric type of s: parametric type without a dictionary
panic: runtime error: index out of range [32] with length 32
goroutine 1 [running]:
github.com/cloudwego/goref/pkg/proc.(*HeapScope).readType(0xc0205da1e0, 0xc01c0da9c0, 0xc003e64734, 0xc003d79c88, 0xc003d7a200)
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/pkg/proc/heap.go:352 +0x34d
github.com/cloudwego/goref/pkg/proc.(*HeapScope).copyGCMask(0xc0205da1e0, 0xc01c0da9c0, 0xc003d79c80)
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/pkg/proc/heap.go:302 +0x96
github.com/cloudwego/goref/pkg/proc.(*ObjRefScope).findObject(0xc05e20fa78, 0xc003d79f28, {0xa81b40, 0xc01fb18340}, {0xa7cc38, 0xc005aee480})
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/pkg/proc/reference.go:69 +0xf5
github.com/cloudwego/goref/pkg/proc.(*ObjRefScope).findRef(0xc05e20fa78, 0xc00726e0f0, 0x0)
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/pkg/proc/reference.go:168 +0xe77
github.com/cloudwego/goref/pkg/proc.ObjectReference(0xc0000bc0f0, {0x9c21e8, 0x7})
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/pkg/proc/reference.go:449 +0xc35
github.com/cloudwego/goref/cmd/grf/cmds.execute(0xd2e2, {0x0, 0x0}, {0x0, 0x0}, {0x9c21e8, 0x7}, 0xc000198500)
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/cmd/grf/cmds/commands.go:139 +0x2ac
github.com/cloudwego/goref/cmd/grf/cmds.attachCmd(0xc0001e0100?, {0xc000118b80?, 0x1?, 0x9bed79?})
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/cmd/grf/cmds/commands.go:108 +0xf9
github.com/spf13/cobra.(*Command).execute(0xc000126f08, {0xc000118b30, 0x1, 0x1})
/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:987 +0xab1
github.com/spf13/cobra.(*Command).ExecuteC(0xc000126c08)
/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(0xc0000061c0?)
/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039 +0x13
main.main()
/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/cmd/grf/main.go:22 +0x1a
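The message "index out of range [32] with length 32" means some slice of length 32 was indexed one element past its end. The trace does not show which slice readType was indexing, so the following is a generic illustration only. It assumes a pointer bitmap with one bit per word; setPtrBit is a hypothetical helper, not a goref function.

```go
package main

import "fmt"

// setPtrBit marks word i as a pointer in mask, using one bit per word.
// It returns false instead of panicking when i falls outside the mask.
func setPtrBit(mask []uint64, i uint64) bool {
	idx := i / 64
	if idx >= uint64(len(mask)) {
		return false
	}
	mask[idx] |= 1 << (i % 64)
	return true
}

func main() {
	mask := make([]uint64, 32) // room for 32*64 = 2048 words

	fmt.Println(setPtrBit(mask, 2047)) // true: last bit of mask[31]
	fmt.Println(setPtrBit(mask, 2048)) // false: would need mask[32]
}
```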
Expected behavior
goref should generate the flame graph.
Goref version:
The master commit.
Environment:
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/root/.cache/go-build'
GOENV='/root/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/root/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/root/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/root/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.2.linux-amd64'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/root/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.2.linux-amd64/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.2'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/root/go/pkg/mod/github.com/cloudwego/goref@v0.0.0-20240722091010-3519d085465e/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1310706020=/tmp/go-build -gno-record-gcc-switches'
Additional context
I am not sure whether this issue is related to the large memory usage (over 500 MB RES). If so, is there any advice on how to scan the memory of an application with a lot of memory in use?
500 MB of RES memory is fine. In theory, goref has no memory usage limit. This problem is likely due to a bug in the support for the Go 1.22 allocation headers feature. I will analyze this issue. In the meantime, you can compile with "GOEXPERIMENT=noallocheaders" to disable allocation headers and run the analysis that way.
Thanks for the reply.
"GOEXPERIMENT=noallocheaders"
After enabling this option, goref no longer crashes. However, this time it fails to complete within 30 minutes. Additionally, the goref process consistently uses about 1.5 CPU cores, even though my machine has 8. It appears to be stuck in a loop or some other repetitive process.
And this message still appears:
2024-07-23T06:12:05Z error layer=debugger could not resolve parametric type of s: parametric type without a dictionary
2024-07-23T06:12:05Z error layer=debugger could not resolve parametric type of s: parametric type without a dictionary
Could you give me the executable file and a core file generated by the gcore command? I'd like to reproduce it in my environment. Compress them with tar -zcvf issue12 ./exec ./core.xx before sending them.
It may also be because the cache is disabled. You can enable the cache and try again.
Ah, it's my fault. After enabling the cache, it finishes in seconds. Thanks!
Could you please provide the executable file and the core file? I can't reproduce the issue on my test service. Also, may I ask whether you have changed the Go source code? I think this is unlikely to occur with the original Go code.
May I ask which version of Go you used to compile k3s? @oilbeater
@jayantxie I pushed a fork of k3s with my edits here: https://github.com/oilbeater/k3s/commit/22433fe8e025501a6ad0ff057009aed6c0f650f5
It is compiled with Go 1.22.4. You can follow the build steps here https://github.com/oilbeater/k3s/blob/main/BUILDING.md with
mkdir -p build/data && make download && make generate
SKIP_VALIDATE=true make
then
./dist/artifacts/k3s server
to start the server.
k3s uses lots of build options and zstd to optimize binary size; maybe some of these options conflict with goref.
I uploaded the binary and core here: https://github.com/oilbeater/k3s/releases/tag/issue
go build -ldflags="-s -w"
Did you build k3s with build flags like these? If so, please remove them, since otherwise we can't get debug info from the executable file :(
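One way to check whether a binary still carries DWARF before attaching goref is to look for the .debug_info section, which the -w linker flag omits. The sketch below uses Go's standard debug/elf package; hasDebugInfo is a name made up for this example. It only checks that the section exists, not that its contents are usable.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// hasDebugInfo reports whether the ELF file at path has a DWARF .debug_info
// section, or the legacy compressed .zdebug_info form.
func hasDebugInfo(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	for _, name := range []string{".debug_info", ".zdebug_info"} {
		if f.Section(name) != nil {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Inspect the path given on the command line, or this program itself.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot locate executable:", err)
		return
	}

	ok, err := hasDebugInfo(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot inspect", path+":", err)
		return
	}
	fmt.Printf("%s: debug_info present: %v\n", path, ok)
}
```

Passing the path of the k3s binary as the first argument would give the same with/without debug_info answer that the file command reports.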
It looks like there are some upstream changes I wasn't aware of that affect the build. I tried again, removing the ldflags and adding GOEXPERIMENT=noallocheaders; the result is here: https://github.com/oilbeater/k3s/releases/download/issue/issue12
It should contain debug_info now:
file k3s
k3s: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=CDXATxDA4Z1gGwhpmDVa/9QqoWhX95GDV5BzCafog/1Lk0SLTATqRJBUM99qa6/o8REHiUI9JW65O03UrvW, with debug_info, not stripped
However, when I run goref this time, it again fails to finish.
Sorry, I was still changing the build options to make goref work when I dumped the core. Here is the new core file: https://github.com/oilbeater/k3s/releases/download/issue/issue13
Fixed; you can try again.