Symbolization seems to fail with buildID mismatch error when a profiling target application is run by invoking a linker directly
What version of pprof are you using?
What operating system and processor architecture are you using?
CPU: amd64
Host OS: Linux 50e76dec012c 5.15.153.1-microsoft-standard-WSL2
And running images like debian:11, rockylinux:8. Details are explained below.
What did you do?
Our app is written in Rust, and we use rust-jemalloc-pprof to get a heap profile in pprof format. The profile is not symbolized, so we pass the exact same binary and *.so files to pprof to symbolize it, but symbolization fails.
The root cause seems to be that the app is run by invoking a linker directly, like entrypoint: ["./dylib/ld-linux-x86-64.so.2", "--library-path", "./dylib", "./rust-pprof-test"]. The reason for doing this is that, due to some internal constraints, we build the Rust app in debian:11, copy the binary and all its dependencies to rocky:8, and run the app there by specifying the linker and dependencies directly. I understand that this is a very irregular situation, but is there any way to obtain symbolized profiling data in such a scenario?
Steps to reproduce:
- Clone this repo https://github.com/ykadowak/rust-pprof-test on a Linux/amd64 machine
- Run these commands in the root directory of the repository:
docker compose up -d
docker compose exec rocky /bin/bash
# inside rockylinux:8 image
curl localhost:3000/debug/pprof/heap > heap.pb.gz
pprof ./heap.pb.gz # cannot symbolize
# you can also try things like
PPROF_BINARY_PATH=./dylib PPROF_TOOLS=./binutils pprof ./rust-pprof-test ./heap.pb.gz
exit
docker compose exec debian /bin/bash
# inside debian:11 image
curl localhost:3000/debug/pprof/heap > heap.pb.gz
pprof ./heap.pb.gz # cannot symbolize
exit
# Uncomment this line: https://github.com/ykadowak/rust-pprof-test/blob/f535eff2268dbf23fa304f851c95c4e275627386/compose.yaml#L16 to run the app directly.
docker compose stop
docker compose up -d
docker compose exec debian /bin/bash
# inside debian:11 image
curl localhost:3000/debug/pprof/heap > heap.pb.gz
pprof ./heap.pb.gz # can symbolize this time
What did you expect to see?
Symbolized result shows up.
What did you see instead?
When the app is run by invoking a linker directly, pprof reports a build ID mismatch error, as shown below.
[root@33b7bd08a8e5 app]# pprof -raw ./heap.pb.gz
Local symbolization failed for ld-linux-x86-64.so.2 (build ID 7914137f6c04cbb6c7ec4ecb6295b5462c4a6c65): build ID mismatch
Comment: executableInfo=3;0;0
Comment: executableInfo=3;1c000;1c000
Comment: executableInfo=3;1c2000;1c2000
Comment: executableInfo=3;2280b8;2290b8
Comment: executableInfo=3;0;0
Comment: executableInfo=3;0;0
Comment: executableInfo=3;3000;3000
Comment: executableInfo=3;14000;14000
Comment: executableInfo=3;17dc8;18dc8
Comment: executableInfo=3;0;0
Comment: executableInfo=3;6000;6000
Comment: executableInfo=3;16000;16000
Comment: executableInfo=3;1bc08;1cc08
Comment: executableInfo=3;0;0
Comment: executableInfo=3;d000;d000
Comment: executableInfo=3;a7000;a7000
Comment: executableInfo=3;141d80;142d80
Comment: executableInfo=3;0;0
Comment: executableInfo=3;1000;1000
Comment: executableInfo=3;3000;3000
Comment: executableInfo=3;3d70;4d70
Comment: executableInfo=3;0;0
Comment: executableInfo=3;22000;22000
Comment: executableInfo=3;17b000;17b000
Comment: executableInfo=3;1c9768;1ca768
Comment: executableInfo=3;0;0
Comment: executableInfo=3;1000;1000
Comment: executableInfo=3;21000;21000
Comment: executableInfo=3;294c0;2a4c0
PeriodType: space bytes
Period: 0
Time: 2024-08-25 15:45:13.383007519 +0000 UTC
Samples:
inuse_space/bytes
4195715: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Locations
1: 0x7f48fb4b86ac M=2
2: 0x7f48fb4b8bbc M=2
3: 0x7f48fb4aa394 M=2
4: 0x7f48fb43752f M=2
5: 0x7f48fb4051aa M=2
6: 0x7f48fb405483 M=2
7: 0x7f48fb3f006a M=2
8: 0x7f48fb3ecbd3 M=2
9: 0x7f48fb3f163a M=2
10: 0x7f48fb3f8d02 M=2
11: 0x7f48fb3f5f88 M=2
12: 0x7f48fb53284c M=2
13: 0x7f48fb3f181b M=2
14: 0x7f48fb06dd09 M=23 __libc_start_main ??:?:0:0 s=0
15: 0x7f48fb3dfec9 M=2
Mappings
1: 0x7f48fb3a6000/0x7f48fb3c17d0/0x0 /app/dylib/ld-linux-x86-64.so.2 7914137f6c04cbb6c7ec4ecb6295b5462c4a6c65
2: 0x7f48fb3c2000/0x7f48fb5674f9/0x1c000 /app/dylib/ld-linux-x86-64.so.2 7914137f6c04cbb6c7ec4ecb6295b5462c4a6c65
3: 0x7f48fb568000/0x7f48fb5cd8f0/0x1c2000 /app/dylib/ld-linux-x86-64.so.2 7914137f6c04cbb6c7ec4ecb6295b5462c4a6c65
4: 0x7f48fb5cf0b8/0x7f48fb8019e0/0x2280b8 /app/dylib/ld-linux-x86-64.so.2 7914137f6c04cbb6c7ec4ecb6295b5462c4a6c65
5: 0x7ffd35996000/0x7ffd35996ce5/0x0 linux-vdso.so.1 f4c596200ed8d0e245960ef7d54a281f43f20530
6: 0x7f48fb38a000/0x7f48fb38c898/0x0 ./dylib/libgcc_s.so.1 596409bc4e94583ef18f141c9b941a46540868ee
7: 0x7f48fb38d000/0x7f48fb39db69/0x3000 ./dylib/libgcc_s.so.1 596409bc4e94583ef18f141c9b941a46540868ee
8: 0x7f48fb39e000/0x7f48fb3a131c/0x14000 ./dylib/libgcc_s.so.1 596409bc4e94583ef18f141c9b941a46540868ee
9: 0x7f48fb3a2dc8/0x7f48fb3a3448/0x17dc8 ./dylib/libgcc_s.so.1 596409bc4e94583ef18f141c9b941a46540868ee
10: 0x7f48fb368000/0x7f48fb36d9e8/0x0 ./dylib/libpthread.so.0 255e355c207aba91a59ae1f808e3b4da443abf0c
11: 0x7f48fb36e000/0x7f48fb37d0ad/0x6000 ./dylib/libpthread.so.0 255e355c207aba91a59ae1f808e3b4da443abf0c
12: 0x7f48fb37e000/0x7f48fb3837d4/0x16000 ./dylib/libpthread.so.0 255e355c207aba91a59ae1f808e3b4da443abf0c
13: 0x7f48fb384c08/0x7f48fb389470/0x1bc08 ./dylib/libpthread.so.0 255e355c207aba91a59ae1f808e3b4da443abf0c
14: 0x7f48fb224000/0x7f48fb230278/0x0 ./dylib/libm.so.6 1d6ff6c4c69f3572486bc27b8290ee932b0b9f39
15: 0x7f48fb231000/0x7f48fb2caca1/0xd000 ./dylib/libm.so.6 1d6ff6c4c69f3572486bc27b8290ee932b0b9f39
16: 0x7f48fb2cb000/0x7f48fb3652c4/0xa7000 ./dylib/libm.so.6 1d6ff6c4c69f3572486bc27b8290ee932b0b9f39
17: 0x7f48fb366d80/0x7f48fb367110/0x141d80 ./dylib/libm.so.6 1d6ff6c4c69f3572486bc27b8290ee932b0b9f39
18: 0x7f48fb21e000/0x7f48fb21edb8/0x0 ./dylib/libdl.so.2 46b3bf3f9b9eb092a5c0cf5575e89092f768054c
19: 0x7f48fb21f000/0x7f48fb220051/0x1000 ./dylib/libdl.so.2 46b3bf3f9b9eb092a5c0cf5575e89092f768054c
20: 0x7f48fb221000/0x7f48fb2216e8/0x3000 ./dylib/libdl.so.2 46b3bf3f9b9eb092a5c0cf5575e89092f768054c
21: 0x7f48fb222d70/0x7f48fb223110/0x3d70 ./dylib/libdl.so.2 46b3bf3f9b9eb092a5c0cf5575e89092f768054c
22: 0x7f48fb04a000/0x7f48fb06b488/0x0 ./dylib/libc.so.6 2b86a1968781038c0766b17c1ea11a2a71d7d907
23: 0x7f48fb06c000/0x7f48fb1c4ecc/0x22000 ./dylib/libc.so.6 2b86a1968781038c0766b17c1ea11a2a71d7d907 [FN][FL][IN]
24: 0x7f48fb1c5000/0x7f48fb2135f4/0x17b000 ./dylib/libc.so.6 2b86a1968781038c0766b17c1ea11a2a71d7d907
25: 0x7f48fb214768/0x7f48fb21d680/0x1c9768 ./dylib/libc.so.6 2b86a1968781038c0766b17c1ea11a2a71d7d907
26: 0x7f48fb802000/0x7f48fb802f68/0x0 ./dylib/ld-linux-x86-64.so.2 1b3277a419c3fa42b199e5a170ea215b32689793
27: 0x7f48fb803000/0x7f48fb8222d0/0x1000 ./dylib/ld-linux-x86-64.so.2 1b3277a419c3fa42b199e5a170ea215b32689793
28: 0x7f48fb823000/0x7f48fb82aca4/0x21000 ./dylib/ld-linux-x86-64.so.2 1b3277a419c3fa42b199e5a170ea215b32689793
29: 0x7f48fb82c4c0/0x7f48fb82e178/0x294c0 ./dylib/ld-linux-x86-64.so.2 1b3277a419c3fa42b199e5a170ea215b32689793
After more research, it seems that the root cause is that rust-jemalloc-pprof wrongly constructs the Mapping field in our use case. Sorry for the fuss.
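For anyone hitting the same error: the inconsistency is visible in the raw dump itself, where mappings 1-4 and 26-29 share the file name ld-linux-x86-64.so.2 but carry different build IDs. Below is a minimal Python sketch that flags such cases in `pprof -raw` output. It assumes the mapping line layout seen in this issue (id, start/limit/offset, path, build ID). The function name `conflicting_build_ids` is hypothetical and not part of pprof or rust-jemalloc-pprof.

```python
import os
import re
from collections import defaultdict

# Matches one line of the "Mappings" section as printed in this issue:
#   <id>: <start>/<limit>/<offset> <path> <build id> [flags]
# This layout is an assumption taken from the dump above.
MAPPING_RE = re.compile(
    r"^\s*(\d+):\s+0x[0-9a-f]+/0x[0-9a-f]+/0x[0-9a-f]+\s+(\S+)\s+([0-9a-f]+)"
)


def conflicting_build_ids(raw_text):
    """Return {basename: {build_id: [paths]}} for basenames with >1 build ID."""
    seen = defaultdict(lambda: defaultdict(set))
    for line in raw_text.splitlines():
        m = MAPPING_RE.match(line)
        if m is None:
            continue  # not a mapping line (samples, locations, comments, ...)
        _, path, build_id = m.groups()
        seen[os.path.basename(path)][build_id].add(path)
    return {
        name: {bid: sorted(paths) for bid, paths in ids.items()}
        for name, ids in seen.items()
        if len(ids) > 1
    }


# Three mapping lines copied from the dump in this issue.
SAMPLE = """\
1: 0x7f48fb3a6000/0x7f48fb3c17d0/0x0 /app/dylib/ld-linux-x86-64.so.2 7914137f6c04cbb6c7ec4ecb6295b5462c4a6c65
22: 0x7f48fb04a000/0x7f48fb06b488/0x0 ./dylib/libc.so.6 2b86a1968781038c0766b17c1ea11a2a71d7d907
26: 0x7f48fb802000/0x7f48fb802f68/0x0 ./dylib/ld-linux-x86-64.so.2 1b3277a419c3fa42b199e5a170ea215b32689793
"""

if __name__ == "__main__":
    for name, ids in conflicting_build_ids(SAMPLE).items():
        print(name)
        for bid, paths in ids.items():
            print(f"  {bid}: {', '.join(paths)}")
```

To check a real profile, pass the text printed by `pprof -raw ./heap.pb.gz` to the function instead of `SAMPLE`. A file name reported with two build IDs suggests that at least one group of mappings is labelled with the wrong file.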