API usage questions.
codecnotsupported opened this issue · 6 comments
I'm trying to build my own fuzzer using the perf_event_open API for capturing the bitstream as a fun side project.
I've got a lot of questions regarding library usage, partly because I lack background information. (I've only skimmed the Intel manual on the subject.)
Let's start with basic library usage as far as I understand it.
Create decoder -> Run decoder -> decoder fills bitmap according to the trace data, where 1 means branch taken and 0 means not taken.
That much is clear to me. What counts as a branch is, I presume, left to the capture configuration.
The arguments could be documented a bit more clearly, though, as I had to check the source to find out what they meant.
libxdc_init:
- Filter: Are these absolute and/or physical addresses, or virtual?
  - What happens if you capture a guest VM from the host: is the address that of the host or of the guest?
  - What if you don't filter by address at record time but just by CR3, do you simply take the whole address range?
- page_cache_fetch: The purpose of page_cache_fetch is to fetch data from the fuzzed target. However, the argument names and their purpose are left out, so I'm guessing what's going on based on the tests.
  - void* self_ptr, uint64_t page, bool* success
  - self_ptr: opaque pointer
  - page: pointer to data it requests? (What is the size? Size of a page of the traced process?)
  - success: Was the memory fetch successful?
  - Return value: The address to access the data?
libxdc_decode: Pretty clear.
- decoder: libxdc_init context
- trace: The recorded intel pt trace bitstream
- trace_size: Size of the trace (+ 1 for the 0x55 byte)
- Return value: Status code
libxdc_register_bb_callback: called for each NEW basic block e.g. Function calls?
- void* opaque_ptr,
- uint64_t start_addr: Source address
- uint64_t cofi_addr: Destination address
libxdc_register_edge_callback: Called on each branch, e.g. if statements, switch cases?
- void* opaque_ptr:
- uint64_t src: Source address
- uint64_t dst: Destination address
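As a rough illustration of the edge-callback shape listed above, here is a minimal sketch of a callback that stores each (src, dst) pair in a fixed-size log. Only the parameter list (opaque_ptr, src, dst) comes from this thread; `edge_log_t`, `MAX_EDGES`, and `record_edge` are hypothetical names made up for the example, and the registration call is left out because its exact signature is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity and container; neither is part of libxdc. */
#define MAX_EDGES 1024

typedef struct {
    uint64_t src[MAX_EDGES];
    uint64_t dst[MAX_EDGES];
    size_t count;   /* number of edges stored so far */
    size_t dropped; /* edges that did not fit into the log */
} edge_log_t;

/* Follows the (opaque_ptr, src, dst) parameter list quoted in this thread. */
void record_edge(void *opaque_ptr, uint64_t src, uint64_t dst) {
    edge_log_t *log = (edge_log_t *)opaque_ptr;
    assert(log != NULL);

    if (log->count >= MAX_EDGES) {
        log->dropped++; /* keep decoding, just note the loss */
        return;
    }
    log->src[log->count] = src;
    log->dst[log->count] = dst;
    log->count++;
}
```

The opaque pointer carries the log, so the callback needs no globals. After decoding, the stored pairs can be printed for human consumption.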
I made a simple test case, but I haven't had any success so far.
That might be due to an improper capture setup, improper library usage, or both.
It's relatively small (~200 LOC); I can post it if you want to take a look. (I need to clean it up first, though.)
Here is the log with DEBUG_TRACES enabled. The bottom part shows the page_cache_fetch requests.
PSB
MODE
MODE
FUP ffffffff98a76314 (TNT: 0)
VMCS
PSBEND
PGE ffffffff98a76316 (TNT: 0)
TIP ffffffff98a16360 (TNT: 0)
TIP ffffffff98a1662c (TNT: 0)
TIP ffffffff98a166cd (TNT: 0)
TIP ffffffff98c15779 (TNT: 0)
TNT a6
TNT 4
TIP ffffffff98c15970 (TNT: 7)
disasm(ffffffff98c15779,0) TNT: 7
TNT e
TIP ffffffff98c17ee5 (TNT: 9)
disasm(ffffffff98c15970,0) TNT: 9
TNT 14
TIP ffffffff98c120f0 (TNT: 12)
disasm(ffffffff98c17ee5,0) TNT: 12
TNT 1a
TIP ffffffff98c18282 (TNT: 15)
disasm(ffffffff98c120f0,0) TNT: 15
TNT 96
TNT 2c
TIP ffffffff98c18683 (TNT: 25)
disasm(ffffffff98c18282,0) TNT: 25
TNT 4
TIP ffffffff98c18771 (TNT: 26)
disasm(ffffffff98c18683,0) TNT: 26
TIP ffffffff98c18814 (TNT: 26)
disasm(ffffffff98c18771,0) TNT: 26
TNT 4
TIP ffffffff98c15970 (TNT: 27)
disasm(ffffffff98c18814,0) TNT: 27
TNT e
TIP ffffffff98c18b0e (TNT: 29)
disasm(ffffffff98c15970,0) TNT: 29
TIP ffffffff98c19dec (TNT: 29)
disasm(ffffffff98c18b0e,0) TNT: 29
TNT 4
TIP ffffffff98c133f3 (TNT: 30)
disasm(ffffffff98c19dec,0) TNT: 30
TIP ffffffff98b489ae (TNT: 30)
disasm(ffffffff98c133f3,0) TNT: 30
TIP ffffffff98b48abc (TNT: 30)
disasm(ffffffff98b489ae,0) TNT: 30
TNT c
TIP ffffffff98c1257d (TNT: 32)
disasm(ffffffff98b48abc,0) TNT: 32
TNT 8
TIP ffffffff98c1a84c (TNT: 34)
disasm(ffffffff98c1257d,0) TNT: 34
TNT 8
TIP ffffffff98c1aa54 (TNT: 36)
disasm(ffffffff98c1a84c,0) TNT: 36
TIP ffffffff98c1228a (TNT: 36)
disasm(ffffffff98c1aa54,0) TNT: 36
TNT 1e
TIP ffffffff98c202e8 (TNT: 39)
disasm(ffffffff98c1228a,0) TNT: 39
TNT 4
TIP ffffffff98c206c3 (TNT: 40)
disasm(ffffffff98c202e8,0) TNT: 40
TNT 72
TIP ffffffff98d057ad (TNT: 45)
disasm(ffffffff98c206c3,0) TNT: 45
TNT c
TIP ffffffff98d057fa (TNT: 47)
disasm(ffffffff98d057ad,0) TNT: 47
TIP ffffffff99571fe9 (TNT: 47)
disasm(ffffffff98d057fa,0) TNT: 47
TNT a2
TNT 56
TIP ffffffff9960008c (TNT: 58)
disasm(ffffffff99571fe9,0) TNT: 58
TNT 80
TIP 7ffff7b7750b (TNT: 64)
disasm(ffffffff9960008c,0) TNT: 64
TNT 4
TIP 5555557217d8 (TNT: 65)
disasm(7ffff7b7750b,0) TNT: 65
opaque_ptr: 0
page: 0x7ffff7b7750b
success: 1
opaque_ptr: 0
page: 0x7ffff7b7750c
success: 1
opaque_ptr: 0
page: 0x7ffff7b7750f
success: 1
opaque_ptr: 0
page: 0x7ffff7b77511
success: 1
opaque_ptr: 0
page: 0x7ffff7b77515
success: 1
opaque_ptr: 0
page: 0x7ffff7b77517
success: 1
opaque_ptr: 0
page: 0x7ffff7b7751c
success: 1
decoder_page_fault
It seems like you are on the right track in general. The bitmap is actually a byte-map where each byte contains the number of times an edge was taken % 0xff. The filters have to be configured in the same way Intel PT was configured. If the whole address space was traced (e.g. by filtering on CR3), you will have to set the filter to contain the whole range. Your inferences about the arguments of page_cache_fetch seem on point. Registering BB/edge callbacks is not needed to fill the bitmap; however, they can be useful if you want to print coverage for human consumption.
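A minimal sketch of a filter that covers the whole address space, for the CR3-only case mentioned above. The layout (four [start, end] pairs), the inclusive end, and the convention that an all-zero pair means "unused" are assumptions made for this example, not confirmed details of the library. `filter_whole_address_space` and `address_is_traced` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: four [start, end] pairs, one per address range. */
#define NUM_FILTERS 4

/* Make range 0 span every address and mark the other ranges as unused. */
void filter_whole_address_space(uint64_t filter[NUM_FILTERS][2]) {
    assert(filter != NULL);
    for (size_t i = 0; i < NUM_FILTERS; i++) {
        filter[i][0] = 0;
        filter[i][1] = 0;
    }
    filter[0][0] = 0;
    filter[0][1] = UINT64_MAX;
}

/* Returns true if addr falls into any configured range (end inclusive here). */
bool address_is_traced(uint64_t filter[NUM_FILTERS][2], uint64_t addr) {
    assert(filter != NULL);
    for (size_t i = 0; i < NUM_FILTERS; i++) {
        if (filter[i][0] == 0 && filter[i][1] == 0) {
            continue; /* unused range */
        }
        if (addr >= filter[i][0] && addr <= filter[i][1]) {
            return true;
        }
    }
    return false;
}
```

With such a filter, both the kernel addresses (0xffffffff98...) and the user-space addresses (0x7ffff7...) from the log above would count as traced.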
How does the log differ from what you expect?
Thanks for the quick reply.
The bitmap is actually a byte-map where each byte contains the number of times an edge was taken % 0xff.
I'm a bit confused about how to use the byte map. For example, how do you reconstruct the IP from the byte map?
Also, what do you mean by "was taken % 0xff"?
EDIT: Percentage in base 16?
How does the log differ from what you expect?
Well, the return code of libxdc_decode isn't decoder_result_s::decoder_success but decoder_result_s::decoder_page_fault (as seen in the log), so I presume I'm doing something wrong.
Also the bitmap contents are empty.
Say you have 4 basic blocks A, B, C, D that are connected in a diamond shape with a loop:
A->B->D->A and A->C->D. The program takes the edges A->B->D->A 5 times and the edges A->C->D once. Then the bitmap will be as follows:
bitmap[A<<1^B] = 5
bitmap[B<<1^D] = 5
bitmap[D<<1^A] = 5
bitmap[A<<1^C] = 1
bitmap[C<<1^D] = 1
and all other entries are zero. See the afl whitepaper for more details. Notably, if an edge was taken more than 255 times, the corresponding entry overflows. As the index is computed by taking (last_bb_id << 1) xor (curr_bb_id), it's not possible to recover the edges taken from the bitmap itself. If you need access to the edges, you will need to use the callbacks. The bitmap is faster to check for new coverage, though. As this is our only concern during fuzzing, it is better to use the bitmap for fuzzing purposes. The edge callbacks can be used to display which code was covered during debugging or harness authoring.
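The diamond example above (A->B, B->D, D->A five times; A->C, C->D once) can be replayed with a small sketch of the indexing rule (last_bb_id << 1) xor (curr_bb_id). The bitmap size, the masking of the index to that size, and the helper names are assumptions made for this example; the basic block ids are arbitrary numbers chosen by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap size; a power of two so the index can be masked. */
#define BITMAP_SIZE 0x10000u

/* Index rule from the thread: (last_bb_id << 1) xor curr_bb_id. */
size_t edge_index(uint64_t last_bb_id, uint64_t curr_bb_id) {
    return (size_t)(((last_bb_id << 1) ^ curr_bb_id) & (BITMAP_SIZE - 1));
}

/* Bump the byte counter of one edge; a uint8_t wraps to 0 after 255. */
void count_edge(uint8_t *bitmap, uint64_t last_bb_id, uint64_t curr_bb_id) {
    assert(bitmap != NULL);
    bitmap[edge_index(last_bb_id, curr_bb_id)]++;
}

/* Replay the diamond: A->B->D->A five times, then A->C->D once. */
void replay_diamond(uint8_t *bitmap, uint64_t a, uint64_t b,
                    uint64_t c, uint64_t d) {
    for (int i = 0; i < 5; i++) {
        count_edge(bitmap, a, b);
        count_edge(bitmap, b, d);
        count_edge(bitmap, d, a);
    }
    count_edge(bitmap, a, c);
    count_edge(bitmap, c, d);
}
```

The index is lossy: with ids 1, 2 and 4, the edges 1->2 and 2->4 both land on index 0, which is one reason the edges cannot be read back out of the bitmap.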
If you are seeing a page_fault, I would assume your page cache is not handling the memory accesses correctly. Usually that indicates that the decoder is trying to disassemble code that the page cache is unable to provide. Make sure that your memory dump contains all addresses that you traced (or limit your tracing to the area that you can disassemble).
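To make the "memory dump must contain all traced addresses" point concrete, here is a minimal sketch of a fetch callback backed by a single contiguous dump. It reports failure for any page outside the dump instead of returning a bogus pointer. The callback parameters follow the list in this thread; `mem_dump_t`, `DUMP_PAGE_SIZE`, and `dump_page_fetch` are hypothetical, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define DUMP_PAGE_SIZE 0x1000ULL

/* Hypothetical description of a memory dump; not a libxdc type. */
typedef struct {
    uint64_t base; /* address of the first dumped byte, page-aligned */
    size_t size;   /* dump length in bytes, a multiple of the page size */
    uint8_t *data; /* dump contents */
} mem_dump_t;

/* Returns a pointer to the start of the page containing `page`,
 * or NULL with *success = false if the dump does not cover it. */
void *dump_page_fetch(void *self_ptr, uint64_t page, bool *success) {
    mem_dump_t *dump = (mem_dump_t *)self_ptr;
    assert(dump != NULL && success != NULL);

    uint64_t aligned = page & ~(DUMP_PAGE_SIZE - 1);
    if (aligned < dump->base || aligned - dump->base >= dump->size) {
        *success = false; /* address was traced but is not in the dump */
        return NULL;
    }
    *success = true;
    return dump->data + (aligned - dump->base);
}
```

Reporting failure explicitly gives the caller a place to log which address was missing from the dump.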
I vaguely understand how it works. I'll have to get my hands dirty and read that whitepaper to get a complete understanding. By which I presume you mean https://aflplus.plus/papers/aflpp-woot2020.pdf.
That said, how do you know which edge (BB) connections belong to which index in the bitmap?
not handling the memory accesses correctly
In my test code, the traced function is within the same process so I'd assume that the page is the same and the page_cache_fetch function would look like this:
void *page_cache_fetch(void *opaque_ptr, uint64_t page, bool *success) {
    *success = true;
    return (void *)page;
}
I looked at the last point where it fails and it seems to be
opaque_ptr: 0
page: 0x7ffff7b77517
success: 1
opaque_ptr: 0
page: 0x7ffff7b7751c
success: 1
decoder_page_fault
I kind of expected a segfault, if it truly wasn't able to fetch the data.
Debugger:
disassemble -s 0x7ffff7b7751c
libc.so.6`ioctl:
0x7ffff7b7751c <+28>: fsubs 0x1(%rcx,%rcx,4)
0x7ffff7b77520 <+32>: orq $-0x1, %rax
0x7ffff7b77524 <+36>: retq
0x7ffff7b77525: nopw %cs:(%rax,%rax)
0x7ffff7b7752f: nop
I use ioctl (the function above) to enable and disable tracing.
It requests the memory just fine. Which is rather strange.
I presume that the trace was not ended properly: I disable tracing -> kernel call (ioctl) -> end of trace.
If I set up filters, or trace an external process, I probably won't have this issue.
I was referring to the original whitepaper (https://lcamtuf.coredump.cx/afl/technical_details.txt). Generally you can't map the bitmap indices to edges. If you care about edges, use the callbacks. Page should probably be PAGESIZE aligned (e.g. 0x1000, like https://github.com/nyx-fuzz/libxdc/blob/master/test/page_cache.c#L168).
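A minimal sketch of the in-process callback from earlier in the thread, changed so that the returned pointer is page-aligned. It assumes 4 KiB pages, that the decoder runs in the same address space as the traced code, and that the decoder expects a pointer to the start of the page containing the requested address. None of these assumptions is confirmed here, and the function name and constant are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as in the 0x1000 example above. */
#define FETCH_PAGE_SIZE 0x1000ULL

/* Same-process fetch: round the requested address down to its page
 * and hand back a pointer to the start of that page.
 * Note: this does not check that the page is mapped and readable. */
void *page_cache_fetch_aligned(void *opaque_ptr, uint64_t page, bool *success) {
    (void)opaque_ptr; /* unused in this sketch */
    assert(success != NULL);

    uint64_t aligned = page & ~(FETCH_PAGE_SIZE - 1);
    *success = true;
    return (void *)(uintptr_t)aligned;
}
```

For the last request in the log, 0x7ffff7b7751c, this returns 0x7ffff7b77000 rather than the unaligned address itself.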