/wtf

wtf is a distributed, code-coverage guided, customizable, cross-platform snapshot-based fuzzer designed for attacking user and / or kernel-mode targets running on Microsoft Windows.

Primary LanguageC++MIT LicenseMIT

what the fuzz

Builds

Overview

what the fuzz or wtf is a distributed, code-coverage guided, customizable, cross-platform snapshot-based fuzzer designed for attacking user and or kernel-mode targets running on Microsoft Windows. Execution of the target can be done inside an emulator with bochscpu (slowest, most precise), inside a Windows VM with the Windows Hypervisor Platform APIs or inside a Linux VM with the KVM APIs (fastest).

It uncovered memory corruption vulnerabilities in a wide range of softwares: IDA Pro, a popular AAA game, the Windows kernel, HEVD secure mode, etc.

Compiled binaries are available from either the CI artifacts or from the Releases section for both Windows & Linux.

If you would like to read more about its history or how to use it on a real target, check out both Building a new snapshot fuzzer & fuzzing IDA and Fuzzing Modern UDP Game Protocols With Snapshot-based Fuzzers.

Special thanks to @yrp604 for providing valuable inputs throughout the project and @masthoon for suggesting to write a demo targeting HEVD secure mode.

Usage

The best way to try the features out is to work with the fuzzer_hevd / fuzzer_tlv_server modules. You can grab the target-hevd.7z / target-tlv_server.7z archives and extract them into the targets/ directory. The archives contain the directory trees that are expected for every targets:

  • inputs is the folder where your input test-cases go into,
  • outputs is the folder where the current minset files are saved into,
  • coverage is the folder where the .cov files are expected to be in,
  • crashes is where the crashes gets saved in,
  • state is where the memory dump (mem.dmp) as well as the CPU state (regs.json) and the symbol store are stored in (symbol-store.json). The symbol store is a simple JSON file that is used on Linux systems to know where to put breakpoints as there is no support for symbols / dbgeng on those platforms. wtf generates this file at runtime everytime you run your target on Windows.

Starting a server node

The server is basically the brain and keeps track of all the state: the aggregated code-coverage, the corpus, it generates and distributes the test-cases to client.

This is how you might choose to launch a server node:

(base) c:\work\codes\wtf\targets\hevd>..\..\src\build\wtf.exe master --max_len=1028 --runs=10000000 --target . --address tcp://192.168.2.41/

The max_len option is used to limit the size of the generated test-case, runs is the number of test-cases it will generate, address specify where wtf needs to be listening on and target is a directory with the directory tree we described above (the user can also choose to override those directories with --input / --output / --crashes).

Fuzzing nodes

The client nodes run a test-case that has been generated and distributed by the server and communicates the result back to the server (code-coverage, result, etc.).

This is how you would start a client node that uses the bochscpu backend:

..\..\src\build\wtf.exe fuzz --backend=bochscpu --name hevd --limit 10000000

The fuzz subcommand is used with the name option to specify which fuzzer module needs to be used, backend specifies the execution backend and limit the maximum number of instruction to execute per testcase (depending on the backend, this option has different meaning).

Running a test-case

If you would like to run a test-case (or a folder filled with test-cases), you can use the run subcommand.

This is how you would would run the crash-0xfffff764b91c0000-0x0-0xffffbf84fb10e780-0x2-0x0 test-case:

..\..\src\build\wtf.exe run --name hevd --state state --backend=bochscpu  --limit 10000000 --input crashes\crash-0xfffff764b91c0000-0x0-0xffffbf84fb10e780-0x2-0x0

Minseting a corpus

To minset a corpus, you need to use a server node and as many client nodes as you need like you would for a fuzzing job. You can simply set the runs optins to 0.

This is how you would minset the corpus in outputs into the minset directory (also highlights how you can override the inputs and outputs directories):

..\..\src\build\wtf.exe master --max_len=1028 --runs=0 --target . --inputs=outputs --outputs=minset

Generating execution traces

The main mechanism available to instrospect in an execution backend is to generate an execution trace. bochscpu is the only backend that has the ability to generate a complete execution traces so it is best to use for debugging purposes. The other backends only generate execution traces used to measure code-coverage (address of the first instruction in a basic block).

This is how you would generate an execution trace for the crash-0xfffff764b91c0000-0x0-0xffffbf84fb10e780-0x2-0x0 test-case:

..\..\src\build\wtf.exe run --name hevd --state state --backend=bochscpu  --limit 10000000 --input crashes\crash-0xfffff764b91c0000-0x0-0xffffbf84fb10e780-0x2-0x0 --trace-type=rip

The execution traces aren't symbolized because Linux systems wouldn't be able to do that and that is why I wrote symbolizer.

This is how you would symbolize the crash-0xfffff764b91c0000-0x0-0xffffbf84fb10e780-0x2-0x0.trace execution trace generated above:

symbolizer.exe --input crash-0xfffff764b91c0000-0x0-0xffffbf84fb10e780-0x2-0x0.trace --crash-dump state\mem.dmp

Generating Tenet traces

If you see yourself needing more contextual awareness, the bochscpu backend allows you to generate execution traces that can be loaded in the Tenet trace explorer. In the below, I start from a crash in memmove and walk back to find out where the source pointer is coming from (user-mode!):

..\..\src\build\wtf.exe run --name hevd --state state --backend=bochscpu  --limit 10000000 --input crashes\crash-0xfffff764b91c0000-0x0-0xffffbf84fb10e780-0x2-0x0 --trace-type=tenet

Generating code-coverage traces

To generate code-coverage traces you can simply use the run subcommand with the --trace-type=cov option.

This is how you would generate code-coverage traces for all the files inside the minset folder and store them in the coverage-traces folder:

..\..\src\build\wtf.exe run --name hevd --state state --backend=whv --input minset --trace-path=coverage-traces --trace-type=cov

Those traces aren't directly loadable into lighthouse because they aren't symbolized.

This is how you would symbolize all the files inside the coverage-traces folder and write the results into coverage-traces-symbolized:

symbolizer.exe --input coverage-traces --crash-dump state\mem.dmp -o coverage-traces-symbolized --style modoff

And finally, you can load those up in lighthouse:

Also if you don't care about individual code-coverage, the master maintains a coverage.cov file contains the unique aggregated code-coverage that has been exercised. It makes it easy to check on the global code-coverage really quick during a fuzzing job.

How does it work?

wtf runs user & kernel mode through an execution backend and relies on the user to insert test-cases in the target. Unlike other classical fuzzer tools, wtf doesn't do much of the heavy lifting; the user does. The user needs to know the harnessed target very well and onboarding a target is an iterative process that will take time. It has a lot of flexibility to offer if you are ready to get hacking though :)

The usual workflow to harness a target is as follows:

  1. Get your target running into a Hyper-V VM running Windows with one virtual CPU and 4GB of RAM.

  2. Put your target into the desired state using KD. For example, to target HEVD's IOCTL handler, I chose to stop the target in user-mode right before the client invokes DeviceIoControl. This will vary depending on your targets but you probably want it to be close to the code you want to fuzz.

    kd> r
    rax=000000dfd98ff3d0 rbx=0000000000000088 rcx=0000000000000088
    rdx=00000000deadbeef rsi=0000000000000000 rdi=0000000000000000
    rip=00007ff6f5bb111e rsp=000000dfd98ff380 rbp=0000000000000000
    r8=000000dfd98ff3d0  r9=0000000000000400 r10=000002263e823055
    r11=00007ff6f5bcb54d r12=0000000000000000 r13=0000000000000000
    r14=0000000000000000 r15=0000000000000000
    iopl=0         nv up ei pl nz na po nc
    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
    hevd_client!main+0xae:
    00007ff6`f5bb111e ff15dc1e0100    call    qword ptr [hevd_client!_imp_DeviceIoControl (00007ff6`f5bc3000)] ds:002b:00007ff6`f5bc3000={KERNEL32!DeviceIoControlImplementation (00007ff8`3e2e6360)}
    
  3. Use bdump.js to generate the kernel crash-dump as well as the regs.json file that contains the CPU state. I recommend to dump those file in a state directory under your target directory (targets/hevd/state for example):

    kd> .scriptload c:\\work\\codes\\bdump\\bdump.js
    [bdump] Usage: !bdump "C:\\path\\to\\dump"
    [bdump] Usage: !bdump_full "C:\\path\\to\\dump"
    [bdump] Usage: !bdump_active_kernel "C:\\path\\to\\dump"
    [bdump] This will create a dump directory and fill it with a memory and register files
    [bdump] NOTE: you must include the quotes and escape the backslashes!
    JavaScript script successfully loaded from 'c:\work\codes\bdump\bdump.js'
    
    kd> !bdump_active_kernel "c:\\work\\codes\\wtf\\targets\\hevd\\state"
    [bdump] creating dir...
    [bdump] saving regs...
    [bdump] register fixups...
    [bdump] don't know how to get mxcsr_mask or fpop, setting to zero...
    [bdump]
    [bdump] don't know how to get avx registers, skipping...
    [bdump]
    [bdump] tr.base is not cannonical...
    [bdump] old tr.base: 0x7375c000
    [bdump] new tr.base: 0xfffff8047375c000
    [bdump]
    [bdump] setting flag 0x2000 on cs.attr...
    [bdump] old cs.attr: 0x2fb
    [bdump] new cs.attr: 0x22fb
    [bdump]
    [bdump] rip and gs don't match kernel/user, swapping...
    [bdump] rip: 0x7ff6f5bb111e
    [bdump] new gs.base: 0xdfd9621000
    [bdump] new kernel_gs_base: 0xfffff8046b6f3000
    [bdump]
    [bdump] non-zero IRQL in usermode, resetting to zero...
    [bdump] saving mem, get a coffee or have a smoke, this will probably take around 10-15 minutes...
    [bdump] Creating c:\work\codes\wtf\targets\hevd\state\mem.dmp - Active kernel and user memory bitmap dump
    [bdump] Collecting pages to write to the dump. This may take a while.
    [bdump] 0% written.
    [...]
    [bdump] 95% written. 1 sec remaining.
    [bdump] Wrote 1.5 GB in 23 sec.
    [bdump] The average transfer rate was 64.7 MB/s.
    [bdump] Dump successfully written
    [bdump] done!
    @$bdump_active_kernel("c:\\work\\codes\\wtf\\targets\\hevd\\state")
    
  4. Create a fuzzer module, write the code that inserts a test-case into your target and define the various conditions to detect crashes or the end of a test-case.

  5. You can also create your own mutator / generator by subclassing the Mutator_t interface. The fuzzer_tlv_server.cc is a good example to understand how you would go about implementing your own.

At this point you should start to iterate and verify that the fuzzer module works as expected. The execution backends are a blackbox so you should generate execution traces to make sure it goes through the right paths, does the right things. During this phase I mainly use the bochscpu backend as it is fully deterministic, starts fast, generating execution traces is possible, code-coverage comes for free, etc. Overall, it's a nicer environment to develop and prototype in.

Once you are satisfied with the module, you can start to look at making it work with the winhv / kvm backends if you need it to run under those. One major difference between the bochscpu backend & the others, is that the others use software breakpoints to provide code-coverage information. As a result, you'll need to load the modules you want coverage for under IDA and use the gen_coveragefile_ida.py script to generate a simple JSON file that gets loaded by wtf. You are free to generate this JSON file yourself using whatever tool you would like: it basically is a list of basic-blocks virtual addresses.

You can also target WoW64 applications by using the !wow64exts.sw Windbg command to switch to the 64-bit context right before creating the snapshot (thanks @cube0x8 for sharing this trick!):

32.kd:x86> !wow64exts.sw
The context is partially valid. Only x86 user-mode context is available.
Switched to Host mode

32.kd> !bdump "c:\\dump"

How to deliver multi-packets to my target?

Complex targets usually carry complex states as well and chances are that you might need to deliver more than one testcase in a session to trigger complex issues. tlv_server.cc is an example of such a server where exercising the parsing function with only one testcase won't be enough to uncover the bugs.

To handle this case, check out fuzzer_tlv_server.cc that shows an example of how to solve this problem.

How to provide a custom mutator / generator?

wtf comes with two popular generic mutators: libfuzzer & honggfuzz. You might want to provide your own or to generate testcases on your own as well.

To do that, you can subclass the Mutator_t interface, and register the function that instantiate your mutator when you define your fuzzing module:

class CustomMutator_t : public Mutator_t {
public:
  static std::unique_ptr<Mutator_t> Create(std::mt19937_64 &Rng,
                                           const size_t TestcaseMaxSize) {
    return std::make_unique<CustomMutator_t>(Rng, TestcaseMaxSize);
  }
  // ...
};

Target_t target("target", Init, InsertTestcase, Restore, CustomMutator_t::Create);

Check out the CustomMutator_t class in the fuzzer_tlv_server.cc module for a complete example.

Execution backends

In this section I briefly mention various differences between the execution backends.

bochscpu

  • ✅ Full system code-coverage,
  • ✅ Demand-paging,
  • ✅ Timeout is the number of instructions which is very precise,
  • ✅ Full execution traces are supported,
  • ✅ Fully deterministic,
  • ❌Speed seems to be good for short executions but not for long executions (~100x slower than KVM when I was fuzzing IDA).

whv

  • ✔ Code-coverage via software breakpoints,
  • ❌ Demand-paging so start-up is slow (as it needs to load the full crash-dump in memory),
  • ✔ Timeout is implemented with a timer,
  • ✔ Only code-coverage traces are supported,
  • ✔ Deterministic if handling source of non determinism manually (for example, patching nt!ExGenRamdom that uses rdrand),
  • ✔ Speed seems to be ok for long executions (lots of bottleneck in whv though; ~10x slower than WHV when I was fuzzing IDA).

KVM

  • ✔ Code-coverage via software breakpoints,
  • ✅ Demand-paging is supported via UFDD,
  • ✔ Timeout is implemented with a timer. ✅ If the hardware supports PMU virtualization, it is used to generate a PMI after X retired instructions (MSR_IA32_FIXED_CTR0),
  • ✔ Only code-coverage traces are supported,
  • ✔ Deterministic if handling source of non determinism manually (for example, patching nt!ExGenRamdom that uses rdrand),
  • ✅ Fastest for long executions (~500m - 1.5 billion instructions; ~100x faster than bochscpu, ~10x faster than whv when I was fuzzing IDA).

Build

The CI builds wtf on Ubuntu 20.04 using clang++-14 / g++-11 and on Windows using Microsoft's Visual Studio 2019.

To build it yourself you need to start a Visual Studio Developper Command Prompt and either run build-release.bat which uses the Ninja generator or build-release-msvc.bat to generate a Visual Studio solution file:

(base) wtf\src\build>build-release.bat
[...]
[2/2] Linking CXX executable wtf.exe

(base) wtf\src\build_msvc>..\build\build-release-msvc.bat
[...]
  Finished generating code
  wtf.vcxproj -> wtf\src\build_msvc\RelWithDebInfo\wtf.exe
  Building Custom Rule wtf/src/CMakeLists.txt

Authors