intel/libipt

Exact behavior of the IPT circular buffer

BigJim opened this issue · 3 comments

I'm still trying to eliminate the sources of the decoding errors I occasionally get. The Intel manual, under "35.2.6 Trace Output", says:

This output range is circular, meaning that when the writes wrap around the end of the buffer they begin again at the base address.

From this, and from the other mentions of the buffer, the exact details of the wrap-around are not clear.

When I poll the buffer (using either a multiple- or single-region buffer setup), tracing is disabled first to get a correct current proc_trace_output_offset read. I know to read (copy from the buffer) from the last offset to the new one. I can detect a wrap-around because the new offset will be less than the one from the last poll. What I'm uncertain about is whether the buffer simply wraps, so that I need to make two reads (one from the last PTO to the end/right edge of the buffer, and a second from offset 0/left edge, appended together), or whether the CPU sees the wrap-around ahead of time and resets to the left edge/base before it writes again.

In other words like on a number line:
A typical read..
|---X--------------Y-------|
Where 'X' is the previous offset read, and 'Y' is the new/current one.
I know to read from the last proc_trace_output_offset (the last read's 'X') to this new 'Y'; no problem, this seems to be correct.

Now let's say I make a second read and get:
Wrapped around read..
|---Y--------------X-------|
I see now that 'Y' (the new offset) is below the previous offset 'X' read.
This tells me a buffer wrap-around occurred since my last poll.
Do I need to do two buffer reads: first the "X-------|" region followed by the "|---Y" (appended), or does the CPU reset to the base and now I just need to read from the "|---Y" region alone?

Pardon my ignorance, I did study how circular/ring buffers work but it's a bit ambiguous how it works for IPT just going by what the manual says.

On Linux, using perf_event_open, or when programming the h/w directly, you'd read X..end and begin..Y and concatenate the two.

Note that when X < Y you may still have gotten a wrap-around.

You may want to look at the perf user-space tool sources. They are located in the kernel tree in tools/perf/.
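
The two-read-and-concatenate approach described above can be sketched roughly as follows. This is only an illustration, not code from perf or libipt: copy_trace_window and its parameters are hypothetical names, and it assumes at most one wrap happened between the two polls. As noted, that cannot be told from the offsets alone, and equal offsets are treated here as "nothing new".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the bytes written between two polls of a circular trace buffer.
 * buf/size describe the whole buffer; prev and cur are the previous and
 * current write offsets. out must have room for size bytes.
 * Returns the number of bytes copied.
 * Assumption: at most one wrap happened between the two polls. */
static size_t copy_trace_window(const unsigned char *buf, size_t size,
                                size_t prev, size_t cur, unsigned char *out)
{
    assert(prev < size && cur < size);

    if (cur >= prev) {
        /* No wrap seen: one contiguous read, X..Y. */
        memcpy(out, buf + prev, cur - prev);
        return cur - prev;
    }

    /* Wrapped: read X..end first, then append begin..Y. */
    size_t tail = size - prev;
    memcpy(out, buf + prev, tail);
    memcpy(out + tail, buf, cur);
    return tail + cur;
}
```

If more than a full buffer of trace can be produced between polls, a helper like this will silently return stale or partial data, so the poll interval and buffer size have to be chosen with that in mind.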

Great idea. That should work.
Many thanks!

It's too bad the IPT scheme doesn't include some sort of per-PSB-block checksum, even a simple one such as a single-byte sum, CRC8 or CRC16.
Then one could verify the integrity of each of these blocks, and if a developer had something wrong it would be obvious right away.

Edit:
Okay, it is the simple case where it just wraps around at the end.
Assuming one uses a simple ToPA scheme where all buffer parts are the same size (or, easier but apparently with more overhead, a single-buffer scheme), when the CPU reaches an END marker it wraps around: the MaskOrTableOffset and OutputOffset fields of RTIT_OUTPUT_MASK_PTRS are reset to zero and the write continues from there.

Note that the manual calls the ToPA setup a "linked list of tables". It's easier to think of this table as a simple linear array with 8-byte (single 64-bit value) elements. When using a multi-region buffer scheme (IA32_RTIT_CTL.ToPA = 1), MaskOrTableOffset is just an index into this array.
I.e., if you have 8 ToPA buffers with the last element having the END bit set, you will see MaskOrTableOffset go from 0 to 7 and then wrap around back to index 0.
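
To make the "linear array of 8-byte elements" picture concrete, here is a rough sketch of filling in such a table. The bit positions (END in bit 0, size in bits 9:6, base address from bit 12 up) are how I read the ToPA entry layout in the manual, so check them against the SDM before relying on this. topa_build_ring and its parameters are made-up names. In this sketch the END entry takes its own slot after the region entries and points back at the table itself, which is what makes the output circular.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as read from the SDM's ToPA entry layout (an assumption
 * of this sketch; verify against the manual). */
#define TOPA_END        (1ull << 0)  /* entry links to a table, not a region */
#define TOPA_SIZE_SHIFT 6            /* bits 9:6: region size = 4 KiB << code */
#define TOPA_BASE_MASK  (~0xfffull)  /* base addresses are 4 KiB aligned */

/* Fill table[0..nregions-1] with equally sized output regions and put an
 * END entry in table[nregions] that points back to the table itself.
 * region_pa[] and table_pa are physical addresses supplied by the caller;
 * table must have room for nregions + 1 entries.
 * Only 4 KiB alignment is checked here; regions larger than 4 KiB are
 * expected to be aligned to their own size. */
static void topa_build_ring(uint64_t *table, uint64_t table_pa,
                            const uint64_t *region_pa, size_t nregions,
                            unsigned size_code)
{
    assert(size_code <= 0xf);
    assert((table_pa & ~TOPA_BASE_MASK) == 0);

    for (size_t i = 0; i < nregions; i++) {
        assert((region_pa[i] & ~TOPA_BASE_MASK) == 0);
        table[i] = (region_pa[i] & TOPA_BASE_MASK)
                 | ((uint64_t)size_code << TOPA_SIZE_SHIFT);
    }

    /* END entry: the CPU continues with the table at this address. */
    table[nregions] = (table_pa & TOPA_BASE_MASK) | TOPA_END;
}
```

Using the same size_code for every region is what keeps the later position arithmetic a simple multiply and add.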

To keep the calculation easy when reading the buffer, you will want all your ToPA buffer sizes to be the same. For polling schemes, the current read position into the buffer is then: (MaskOrTableOffset * size of ToPA buffer) + OutputOffset
To get the size of the read, you compare the previous position with the current one, looking for the wrap condition described above.
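
As a rough sketch of that arithmetic: the helpers below split a raw IA32_RTIT_OUTPUT_MASK_PTRS value into its two fields, turn them into a linear position, and compute the size of the read. The field positions (MaskOrTableOffset in bits 31:7, OutputOffset in bits 63:32) are my reading of the manual and should be double-checked. All function and struct names are hypothetical, and the sketch assumes equally sized regions and at most one wrap between polls.

```c
#include <assert.h>
#include <stdint.h>

/* Current write position as reported by the hardware (ToPA mode). */
struct pt_write_pos {
    uint32_t table_index;   /* MaskOrTableOffset: index of the ToPA entry */
    uint32_t output_offset; /* OutputOffset: byte offset inside that region */
};

/* Split a raw IA32_RTIT_OUTPUT_MASK_PTRS value into its two fields.
 * Assumed layout: bits 31:7 table index, bits 63:32 output offset. */
static struct pt_write_pos decode_mask_ptrs(uint64_t msr)
{
    struct pt_write_pos pos;
    pos.table_index = (uint32_t)((msr & 0xffffff80ull) >> 7);
    pos.output_offset = (uint32_t)(msr >> 32);
    return pos;
}

/* (MaskOrTableOffset * size of ToPA buffer) + OutputOffset */
static uint64_t linear_position(struct pt_write_pos pos, uint64_t region_size)
{
    return (uint64_t)pos.table_index * region_size + pos.output_offset;
}

/* Bytes written between two polls, where total is the combined size of
 * all regions. Assumes at most one wrap between the polls. */
static uint64_t bytes_since(uint64_t prev, uint64_t cur, uint64_t total)
{
    assert(prev < total && cur < total);
    return cur >= prev ? cur - prev : total - prev + cur;
}
```

For example, with 4 KiB regions a table index of 3 and an output offset of 0x123 give a linear position of 3 * 4096 + 0x123.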