google/pprof

Drops in numbers with -alloc_space option

sangelovic opened this issue · 3 comments

What version of pprof are you using?

I did /usr/local/go/bin/go install github.com/google/pprof@latest. My go env output is:

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/one/.cache/go-build"
GOENV="/home/one/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/one/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/one/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.3"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3433664934=/tmp/go-build -gno-record-gcc-switches"

What operating system and processor architecture are you using?

Cross compiling for aarch64, pprof running on Debian 11 x86-64.

What did you do?

I did four consecutive memory allocations dumps from my application using tcmalloc. Then I analyzed the dump files:

cd /home/one/go/bin
./pprof -top -sample_index=alloc_objects <path-to-app-with-debug-symbols> 0001.heap
./pprof -top -sample_index=alloc_objects <path-to-app-with-debug-symbols> 0002.heap
./pprof -top -sample_index=alloc_objects <path-to-app-with-debug-symbols> 0003.heap
./pprof -top -sample_index=alloc_objects <path-to-app-with-debug-symbols> 0004.heap

What did you expect to see?

I expected to see ever-increasing (or at least the same) count of allocations inside. For example, in output from 0003.heap:

flat  flat%   sum%        cum   cum%
1107  2.58% 79.81%       5122 11.96%  CBnPropertyContents::CBnPropertyContents

But in output from 0004.heap I get:

flat  flat%   sum%        cum   cum%
688   1.20% 81.29%       4727  8.26%  CBnPropertyContents::CBnPropertyContents

The flat number is lower than in the previous dump. The same behavior is with -alloc_space option.

What did you see instead?

The flat number in output from 0004.heap shall be the equal to or higher than 1107. Is this a bug? Or do I get the -alloc_objects/-alloc_space option wrong?

I did four consecutive memory allocations dumps from my application using tcmalloc.

Could you provide more details how you created those dumps?

Thanks for quick reply. At the beginning of main I call HeapProfilerStart("/opt/myapp"); and then in a specific function I call HeapProfilerDump("memprofile");. All four dumps are generated by that single call, at a rather quiescent time in between relatively long (~1-2 minutes) operations.

Since we are cross-compiling and are based on Yocto, I then on my host machine invoke pprof like this (thanks to its recently extended support for debug symbols it finds debug versions of libs correctly):

PPROF_BINARY_PATH=/home/projects/yocto-sdk/rootfs-dbg ./pprof -top -alloc_objects -tools=/home/projects/yocto-sdk/sysnative/usr/bin/aarch64-ccp-linux/aarch64-ccp-linux- /home/projects/yocto-sdk/rootfs/opt/myapp-with-dbg-symbols /home/projects/yocto-sdk/rootfs/opt/myapp.0001.heap

It writes this at the beginning:

Some binary filenames not available. Symbolization may be incomplete.
Try setting PPROF_BINARY_PATH to the search path for local binaries.

But it seems to symbolize all symbols successfully.

  1. Excerpt of output (the top five items) from the first dump:
      flat  flat%   sum%        cum   cum%
     18876 44.07% 44.07%      18876 44.07%  std::allocator_traits::allocate
      5149 12.02% 56.09%       9331 21.79%  CBnRunTimePrototype::CBnRunTimePrototype
      4211  9.83% 65.92%       4211  9.83%  CDatabase::CDatabase
      3344  7.81% 73.73%       3451  8.06%  std::make_unique
      1107  2.58% 79.81%       5122 11.96%  CBnPropertyContents::CBnPropertyContents
  1. Second dump:
      flat  flat%   sum%        cum   cum%
     24505 42.83% 42.83%      24505 42.83%  std::allocator_traits::allocate
      6078 10.62% 53.46%       6552 11.45%  std::make_unique
      5149  9.00% 62.46%       9331 16.31%  CBnRunTimePrototype::CBnRunTimePrototype
      4587  8.02% 70.47%       4587  8.02%  CDatabase::CDatabase
       688  1.20% 81.29%       4727  8.26%  CBnPropertyContents::CBnPropertyContents

You can see that while others are growing (or remain at least the same), CBnPropertyContents::CBnPropertyContents dropped suddenly. We can't trust this number anymore.

  1. Third dump:
      flat  flat%   sum%        cum   cum%
     29510 39.96% 39.96%      29510 39.96%  std::allocator_traits::allocate
      6332  8.57% 48.54%       7514 10.18%  std::make_unique
      5297  7.17% 55.71%       5297  7.17%  CDatabase::CDatabase
      5149  6.97% 62.68%       9331 12.64%  CBnRunTimePrototype::CBnRunTimePrototype
       703  0.95% 84.64%       4798  6.50%  CBnPropertyContents::CBnPropertyContents

CBnPropertyContents::CBnPropertyContents is slowly rising again, while other still rise continually (or remain the same).

  1. Fourth dump:
      flat  flat%   sum%        cum   cum%
     30407 43.51% 43.51%      30407 43.51%  std::allocator_traits::allocate
      9393 13.44% 56.96%       9930 14.21%  std::make_unique
      5149  7.37% 64.32%       9331 13.35%  CBnRunTimePrototype::CBnRunTimePrototype
      5075  7.26% 71.59%       5075  7.26%  CDatabase::CDatabase
       825  1.18% 82.42%       4857  6.95%  CBnPropertyContents::CBnPropertyContents

CBnPropertyContents::CBnPropertyContents rising slowly, but now CDatabase::CDatabase dropped suddenly down by a difference of 222... :-(

Is this correct behavior? Did I misunderstand the -alloc_objects/-alloc_space option?

I can send more data to you if you want. Thanks much.

I suspect this is using https://github.com/gperftools/gperftools, not https://github.com/google/tcmalloc? I think the issue needs to go to those repos anyway as it's not pprof related.