intel/libipt

How to trace the library code

hdhyun216 opened this issue · 4 comments

I want to acquire an Intel PT trace of the library code that an executable calls. However, the trace only seems to cover the executable's own prologue and epilogue and the code written by the developer. When the executable calls a library function, I cannot get a trace of that function's code.
The following is the disassembly that ptxed produced for an executable that only calls printf() and terminates.

...
0000000000400526 push rbp
0000000000400527 mov rbp, rsp
000000000040052a mov edi, 0x4005c4
000000000040052f mov eax, 0x0
0000000000400534 call 0x400400
0000000000400400 jmp qword ptr [rip+0x200c12]
0000000000400406 push 0x0
000000000040040b jmp 0x4003f0
00000000004003f0 push qword ptr [rip+0x200c12]
00000000004003f6 jmp qword ptr [rip+0x200c14]
...

As you can see, there is no code for printf(), a standard I/O library function.
Even in the raw PT packet dump, I could not find any part corresponding to the library function. How can I get the PT trace of the library code?

P.S. Is there any way to get a PT trace of Python code?

Unless you use any filtering, PT traces all instructions that were executed (well, it only records branches). You could try that with GDB, for example. Set a breakpoint on printf(), start recording "(gdb) record pt", and continue to that breakpoint. If you step out of printf(), you should see all the printf() code getting recorded.
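
The GDB steps described above could be captured in a command file, roughly as sketched below. This is only a sketch under assumptions: ./hello is the test program from this thread, GDB was built with libipt support, the CPU and kernel expose Intel PT, and the file name trace-printf.gdb is arbitrary. "record btrace pt" is the long form of "record pt".

```shell
#!/bin/sh
# Write a GDB command file that records printf() with Intel PT.
# Assumption: the file name trace-printf.gdb is made up for this example.
cat > trace-printf.gdb <<'EOF'
# Run to main() so that there is a live process to record.
start
# Stop at the library function of interest.
break printf
# Begin branch tracing in Intel PT format.
record btrace pt
continue
# Run until printf() returns to main(); what it executed is recorded.
finish
# Show the tail of the decoded instruction trace.
record instruction-history
EOF

# To replay the session non-interactively on a PT-capable machine:
#   gdb -batch -x trace-printf.gdb ./hello
echo "wrote trace-printf.gdb"
```

If the instruction history lists addresses inside libc, the library code was traced; how many instructions are shown depends on GDB's history-size setting.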

Tracing Python is a bit more tricky when the code is JIT-compiled. PT traces all the code that was executed, but you need support in the JIT compiler to preserve all the pages it generated code into. Otherwise, if you try to decode the trace later on, you won't have the binary code available and decoding will fail.

Yes, as you suggested, I confirmed with gdb that PT also traces library code.
I rebuilt a newer version of gdb and checked it.
Now I can see the disassembly of the library function with gdb's 'instruction-history' command.
But I still can't find the PT packets of the library function in perf.data.
Could you look at what I did and tell me what's wrong?
Here are the perf commands that I used. (The executable is named 'hello'; it only calls printf() and terminates.)

$ sudo perf record -e intel_pt//u ./hello
$ sudo perf script -D > hello.dump

At this point, I can see the PT packet dump of the executable. But it is really hard to recognize where the printf() packets are. (It is even hard to find where main() starts and ends. I don't know how to view the trace of only main() and the library functions that main() calls.) Therefore, to check whether library code is traced, I obtained the disassembly of the same trace using ptxed.

$ sudo libipt/script/perf-read-aux.bash
$ sudo libipt/script/perf-read-sideband.bash

The two scripts above produced one 'perf.data-aux-idx[n].bin' file and four 'perf.data-sideband-cpu[n].pevent' files. I then ran the following ptxed command with one aux file and the corresponding sideband file.

$ libipt/bin/ptxed $(sudo libipt/script/perf-get-opts.bash -m perf.data-sideband-cpu1.pevent) --pevent:vdso-x32 --event:tick --pt perf.data-aux-idx1.bin > helloptxed.txt

Then I got the assembly that I mentioned in the first comment.
As you can see, there are no instructions from printf().
From this I concluded that the library code was not traced.
Is there something wrong with the commands, or is there something I have missed?

The "--pevent:vdso-x32" takes a filename as argument that points to the vdso library for the x32 architecture. Perf dumps the vdso and stores it in its cache. You can either point ptxed to that file or use the perf-copy-mapped-files.bash script to copy all files referenced in the trace including the vdso. This is recommended if you want to decode on a different system or after updating your system.

Watch out for any errors or warnings in the ptxed or ptdump output.

I don't see how one could tell from the above disassembly that the code for printf() is not included in the trace.
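
One way to see this is to resolve the RIP-relative operands in the listing from the first comment. The sketch below assumes that each of the three instructions is 6 bytes long (which matches the address of the instruction that follows each one in the listing) and that the binary uses the conventional x86-64 lazy-binding PLT layout. The variable names are made up for this example.

```shell
#!/bin/sh
# Resolve RIP-relative memory operands from the ptxed listing:
#   slot = address of the instruction + its length + displacement
# Assumption: all three instructions are 6 bytes long.
insn_len=6

# 400400: jmp  qword ptr [rip+0x200c12]   (PLT entry called from main)
printf_slot=$(printf '0x%x' $((0x400400 + $insn_len + 0x200c12)))
# 4003f0: push qword ptr [rip+0x200c12]   (first instruction of the PLT header)
push_slot=$(printf '0x%x' $((0x4003f0 + $insn_len + 0x200c12)))
# 4003f6: jmp  qword ptr [rip+0x200c14]   (last instruction shown)
jmp_slot=$(printf '0x%x' $((0x4003f6 + $insn_len + 0x200c14)))

echo "call target is read from $printf_slot"   # 0x601018
echo "pushed value is read from $push_slot"    # 0x601008
echo "final jmp target is read from $jmp_slot" # 0x601010
```

Under that assumed layout, all three slots sit in a writable table filled in at run time, and the final jmp leaves the executable for the dynamic linker's resolver. The instructions that follow it would then belong to a shared library, so ptxed needs those library files to decode them, and the part of the listing elided by "..." does not show whether it could.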

Because the region shown above was the only part of the disassembly that covered main(), I thought that, if a library trace existed, the library's instructions would appear in the middle of that region.

I think I should specify the vdso file name for that option, as you indicated.
Thank you for your kind explanation.