nyx-fuzz/libxdc

API usage questions.

codecnotsupported opened this issue · 6 comments

I'm trying to build my own fuzzer using the perf_event_open API for capturing the bitstream as a fun side project.
I've got a lot of questions regarding library usage, partially due to lacking background information. (I've sparsely read the Intel manual on the subject)

Let's start with basic library usage as far as I understand it.
Create decoder -> Run decoder -> decoder fills bitmap according to the trace data, where 1 means branch taken and 0 means not taken.
That much is clear to me. What is considered a branch is (what I presume) left to the capture configuration.

The arguments could be documented a bit more clearly, though, as I needed to check the source to find out what they meant.

libxdc_init:

  • Filter: Are these absolute and/or physical addresses or virtual?

    • What happens if you capture a guest VM from the host, is the address of the host or the guest?
    • What if you don't filter by address at record time but just by CR3, do you simply take the whole address range?
  • page_cache_fetch: The purpose of page_cache_fetch is to fetch data from the fuzzed target. However, the argument names and their purpose are left out, so I'm guessing what's going on based on the tests.

    • void* self_ptr, uint64_t page, bool* success
      • self_ptr: opaque pointer
      • page: pointer to data it requests? (What is the size? Size of a page of the traced process?)
      • success: Was the memory fetch successful?
      • Return value: The address to access the data?

libxdc_decode: Pretty clear.

  • decoder: libxdc_init context
  • trace: The recorded intel pt trace bitstream
  • trace_size: Size of the trace (+ 1 for the 0x55 byte)
  • Return value: Status code

libxdc_register_bb_callback: called for each NEW basic block e.g. Function calls?

  • void* opaque_ptr,
  • uint64_t start_addr: Source address
  • uint64_t cofi_addr: Destination address

libxdc_register_edge_callback: Called on each branch, e.g. if statements, switch cases?

  • void* opaque_ptr:
  • uint64_t src: Source address
  • uint64_t dst: Destination address

I made a simple test case but I haven't had any success so far.
This might be due to improper capture set-up, improper library usage, or both.
It's relatively small (~200 LOC), I can post it if you want to take a look. (I need to clean it up first, though.)

Here is the log with DEBUG_TRACES enabled. The bottom part shows the page_cache_fetch requests.

PSB
MODE
MODE
FUP     ffffffff98a76314 (TNT: 0)
VMCS
PSBEND
PGE     ffffffff98a76316 (TNT: 0)
TIP     ffffffff98a16360 (TNT: 0)
TIP     ffffffff98a1662c (TNT: 0)
TIP     ffffffff98a166cd (TNT: 0)
TIP     ffffffff98c15779 (TNT: 0)
TNT a6
TNT 4
TIP     ffffffff98c15970 (TNT: 7)


disasm(ffffffff98c15779,0)      TNT: 7
TNT e
TIP     ffffffff98c17ee5 (TNT: 9)


disasm(ffffffff98c15970,0)      TNT: 9
TNT 14
TIP     ffffffff98c120f0 (TNT: 12)


disasm(ffffffff98c17ee5,0)      TNT: 12
TNT 1a
TIP     ffffffff98c18282 (TNT: 15)


disasm(ffffffff98c120f0,0)      TNT: 15
TNT 96
TNT 2c
TIP     ffffffff98c18683 (TNT: 25)


disasm(ffffffff98c18282,0)      TNT: 25
TNT 4
TIP     ffffffff98c18771 (TNT: 26)


disasm(ffffffff98c18683,0)      TNT: 26
TIP     ffffffff98c18814 (TNT: 26)


disasm(ffffffff98c18771,0)      TNT: 26
TNT 4
TIP     ffffffff98c15970 (TNT: 27)


disasm(ffffffff98c18814,0)      TNT: 27
TNT e
TIP     ffffffff98c18b0e (TNT: 29)


disasm(ffffffff98c15970,0)      TNT: 29
TIP     ffffffff98c19dec (TNT: 29)


disasm(ffffffff98c18b0e,0)      TNT: 29
TNT 4
TIP     ffffffff98c133f3 (TNT: 30)


disasm(ffffffff98c19dec,0)      TNT: 30
TIP     ffffffff98b489ae (TNT: 30)


disasm(ffffffff98c133f3,0)      TNT: 30
TIP     ffffffff98b48abc (TNT: 30)


disasm(ffffffff98b489ae,0)      TNT: 30
TNT c
TIP     ffffffff98c1257d (TNT: 32)


disasm(ffffffff98b48abc,0)      TNT: 32
TNT 8
TIP     ffffffff98c1a84c (TNT: 34)


disasm(ffffffff98c1257d,0)      TNT: 34
TNT 8
TIP     ffffffff98c1aa54 (TNT: 36)


disasm(ffffffff98c1a84c,0)      TNT: 36
TIP     ffffffff98c1228a (TNT: 36)


disasm(ffffffff98c1aa54,0)      TNT: 36
TNT 1e
TIP     ffffffff98c202e8 (TNT: 39)


disasm(ffffffff98c1228a,0)      TNT: 39
TNT 4
TIP     ffffffff98c206c3 (TNT: 40)


disasm(ffffffff98c202e8,0)      TNT: 40
TNT 72
TIP     ffffffff98d057ad (TNT: 45)


disasm(ffffffff98c206c3,0)      TNT: 45
TNT c
TIP     ffffffff98d057fa (TNT: 47)


disasm(ffffffff98d057ad,0)      TNT: 47
TIP     ffffffff99571fe9 (TNT: 47)


disasm(ffffffff98d057fa,0)      TNT: 47
TNT a2
TNT 56
TIP     ffffffff9960008c (TNT: 58)


disasm(ffffffff99571fe9,0)      TNT: 58
TNT 80
TIP     7ffff7b7750b (TNT: 64)


disasm(ffffffff9960008c,0)      TNT: 64
TNT 4
TIP     5555557217d8 (TNT: 65)


disasm(7ffff7b7750b,0)  TNT: 65

opaque_ptr:     0
page:   0x7ffff7b7750b
success:        1

opaque_ptr:     0
page:   0x7ffff7b7750c
success:        1

opaque_ptr:     0
page:   0x7ffff7b7750f
success:        1

opaque_ptr:     0
page:   0x7ffff7b77511
success:        1

opaque_ptr:     0
page:   0x7ffff7b77515
success:        1

opaque_ptr:     0
page:   0x7ffff7b77517
success:        1

opaque_ptr:     0
page:   0x7ffff7b7751c
success:        1
decoder_page_fault
eqv commented

It seems like you are on the right track in general. The bitmap is actually a byte-map where each byte contains the number of times an edge was taken % 0xff. The filters have to be configured the same way Intel PT was configured. If the whole address space was traced (e.g. by filtering on CR3), you will have to set the filter to contain the whole range. Your inference on the arguments of page_cache_fetch seems on point. Registering BB/edge callbacks is not needed to fill the bitmap; however, they can be useful if you want to print coverage for human consumption.

How does the log differ from what you expect?
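As a rough illustration of using an edge callback to print coverage for human consumption, here is a minimal sketch. It assumes the callback shape discussed in this thread (`void *opaque_ptr, uint64_t src, uint64_t dst`); the actual prototype in libxdc.h may differ and should be checked. `struct edge_log`, `MAX_EDGES`, and `print_edges` are hypothetical names invented for this example, and the registration call itself is deliberately left out.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical capacity for this example only. */
#define MAX_EDGES 1024

/* Hypothetical container that the opaque pointer refers to. */
struct edge_log {
    uint64_t src[MAX_EDGES];
    uint64_t dst[MAX_EDGES];
    size_t count;   /* number of edges stored */
    size_t dropped; /* edges discarded because the log was full */
};

/* Callback in the (opaque_ptr, src, dst) shape assumed from this thread. */
static void edge_callback(void *opaque_ptr, uint64_t src, uint64_t dst) {
    struct edge_log *edges = opaque_ptr;
    assert(edges != NULL);
    if (edges->count == MAX_EDGES) {
        edges->dropped++;
        return;
    }
    edges->src[edges->count] = src;
    edges->dst[edges->count] = dst;
    edges->count++;
}

/* Print every recorded edge as "src -> dst" in hexadecimal. */
static void print_edges(const struct edge_log *edges, FILE *out) {
    for (size_t i = 0; i < edges->count; i++) {
        fprintf(out, "%016" PRIx64 " -> %016" PRIx64 "\n",
                edges->src[i], edges->dst[i]);
    }
}
```

The log would be passed as the opaque pointer when registering the callback, and `print_edges` called once decoding has finished.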

Thanks for the quick reply.

The bitmap is actually a byte-map where each byte contains the number of times an edge was taken % 0xff.

I'm a bit confused about how to use the byte map. For example, how do you reconstruct the IP from the byte map?
Also, what do you mean by "was taken % 0xff"?
EDIT: Percentage in base 16?

How does the log differ from what you expect?

Well the return code of libxdc_decode isn't decoder_result_s::decoder_success
but decoder_result_s::decoder_page_fault (as seen in the log) so I presume I'm doing something wrong.
Also the bitmap contents are empty.

eqv commented

Say you have 4 basic blocks A, B, C, D that are connected in a diamond shape with a loop:

A->B->D->A and A->C->D. The program takes the edges A->B->D->A 5 times and the edges A->C->D once. Then the bitmap will be as follows:

bitmap[A<<1^B] = 5
bitmap[B<<1^D] = 5
bitmap[D<<1^A] = 5
bitmap[A<<1^C] = 1
bitmap[C<<1^D] = 1

and all other entries are zero. See the afl whitepaper for more details. Notably, if an edge was taken more than 255 times, the corresponding entry overflows. As the index is computed by taking (last_bb_id << 1) xor (curr_bb_id), it's not possible to compute the edges taken from the bitmap itself. If you need access to the edges, you will need to use the callbacks. The bitmap is faster to check for new coverage, though. As this is our only concern during fuzzing, it is better to use the bitmap for fuzzing purposes. The edge callbacks can be used to display which code was covered during debugging or harness authoring.
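To make the indexing concrete, here is a small sketch that replays the diamond example using the index formula quoted above. It is not libxdc's implementation: the bitmap size, the masking to that size, the block IDs used in the usage note, and the `seen` map used for the new-coverage check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap size; assumed to be a power of two so masking works. */
#define BITMAP_SIZE (1u << 16)

/* Index formula as described above: (last_bb_id << 1) xor curr_bb_id,
 * reduced to the bitmap size (the reduction is an assumption). */
static size_t edge_index(uint64_t last_bb, uint64_t curr_bb) {
    return (size_t)(((last_bb << 1) ^ curr_bb) & (BITMAP_SIZE - 1u));
}

/* Count every edge along a sequence of block IDs.
 * Each entry is a uint8_t, so it wraps around once it passes 255. */
static void record_path(uint8_t *bitmap, const uint64_t *blocks, size_t n) {
    assert(bitmap != NULL && blocks != NULL);
    for (size_t i = 1; i < n; i++) {
        bitmap[edge_index(blocks[i - 1], blocks[i])]++;
    }
}

/* Return true if this run hit an entry never seen before, and remember it. */
static bool has_new_coverage(const uint8_t *run, uint8_t *seen) {
    bool found = false;
    for (size_t i = 0; i < BITMAP_SIZE; i++) {
        if (run[i] != 0 && seen[i] == 0) {
            seen[i] = 1;
            found = true;
        }
    }
    return found;
}
```

For example, with the made-up IDs A=0x10, B=0x23, C=0x35, D=0x47, recording the path A,B,D five times followed by A,C,D fills exactly five entries. Different edges can land on the same index (with IDs 1, 2, 4 the edges 1->2 and 2->4 both map to index 0), which is one more reason the edges cannot be recovered from the bitmap.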

If you are seeing a page_fault, I would assume your page cache is not handling the memory accesses correctly. Usually that would indicate that the decoder is trying to disassemble code that the page cache is unable to provide. Make sure that your memory dump contains all addresses that you traced (or limit your tracing to the area that you can disassemble).

I vaguely understand how it works, I'll have to get my hands dirty to get a complete understanding and read that white paper. By which I presume you mean https://aflplus.plus/papers/aflpp-woot2020.pdf.
That said, how do you know which edge (BB) connection belongs to which index in the bitmap?

not handling the memory accesses correctly

In my test code, the traced function is within the same process so I'd assume that the page is the same and the page_cache_fetch function would look like this:

void *page_cache_fetch(void *opaque_ptr, uint64_t page, bool *success) {
  *success = true;
  return (void *)page;
}

I looked at the last point where it fails and it seems to be

opaque_ptr:     0
page:   0x7ffff7b77517
success:        1

opaque_ptr:     0
page:   0x7ffff7b7751c
success:        1
decoder_page_fault

I kind of expected a segfault, if it truly wasn't able to fetch the data.

Debugger:


disassemble -s 0x7ffff7b7751c
libc.so.6`ioctl:
    0x7ffff7b7751c <+28>: fsubs  0x1(%rcx,%rcx,4)
    0x7ffff7b77520 <+32>: orq    $-0x1, %rax
    0x7ffff7b77524 <+36>: retq   
    0x7ffff7b77525:       nopw   %cs:(%rax,%rax)
    0x7ffff7b7752f:       nop  

I use ioctl (the function above) to enable and disable tracing.
It requests the memory just fine, which is rather strange.
I presume that the trace was not ended cleanly: when I disable tracing, the sequence is kernel call (ioctl) -> end of trace.
If I set up filters or trace an external process, I probably won't have this issue.

eqv commented

I was referring to the original whitepaper (https://lcamtuf.coredump.cx/afl/technical_details.txt). Generally you can't map the bitmap indices to edges. If you care about edges, use the callbacks. Page should probably be PAGESIZE aligned (e.g. 0x1000, like https://github.com/nyx-fuzz/libxdc/blob/master/test/page_cache.c#L168).
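A page-aligned fetch callback might look roughly like the sketch below. It keeps the `page_cache_fetch` signature used earlier in this thread, but everything else is an assumption: that the callback should resolve the 4 KiB page containing the requested address and return a pointer to the start of that page, and that the traced memory is available as a list of page snapshots. `struct page_entry`, `struct page_store`, and `EXAMPLE_PAGE_SIZE` are hypothetical names; the linked test/page_cache.c is the authoritative reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size (0x1000), as suggested above. */
#define EXAMPLE_PAGE_SIZE 0x1000ULL
#define EXAMPLE_PAGE_MASK (~(EXAMPLE_PAGE_SIZE - 1))

/* Hypothetical snapshot of traced memory: one entry per 4 KiB page. */
struct page_entry {
    uint64_t vaddr; /* page-aligned virtual address */
    uint8_t *data;  /* EXAMPLE_PAGE_SIZE bytes of page content */
};

struct page_store {
    struct page_entry *pages;
    size_t count;
};

/* Look up the page containing the requested address.
 * Reports failure explicitly instead of returning an unchecked pointer. */
void *page_cache_fetch(void *opaque_ptr, uint64_t page, bool *success) {
    struct page_store *store = opaque_ptr;
    uint64_t base = page & EXAMPLE_PAGE_MASK; /* align down to page start */

    assert(success != NULL);
    if (store != NULL) {
        for (size_t i = 0; i < store->count; i++) {
            if (store->pages[i].vaddr == base) {
                *success = true;
                return store->pages[i].data;
            }
        }
    }
    *success = false;
    return NULL;
}
```

Under these assumptions, a request for 0x7ffff7b7751c resolves to the page at 0x7ffff7b77000, and the byte of interest sits at offset 0x51c within the returned buffer. A request for a page that was never captured returns NULL with success set to false.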