
__sanitizer::SymbolizerTool ecosystem as a standalone library


SanitizerSymbolizerTool

Introduction

Fuzzers from the AFL family try to avoid symbolization, the step that turns virtual addresses into file/line locations, when working with sanitizers, because:

Similarly, include symbolize=0, since without it, AFL++ may have difficulty telling crashes and hangs apart.

(according to "12) Third-party variables set by afl-fuzz & other tools")

They do this by:

set abort_on_error and symbolize for all the four sanitizer flags

(see #1618 and #1624)
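As a rough illustration of what "setting abort_on_error and symbolize" can look like, here is a minimal sketch in C. It is not the AFL++ implementation. The function name is hypothetical, and the choice of ASAN_OPTIONS, MSAN_OPTIONS, UBSAN_OPTIONS and LSAN_OPTIONS as "the four sanitizer flags" is an assumption. The sketch also assumes that values already set by the user should be left alone.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Assumption: these are the four option variables meant by
 * "all the four sanitizer flags". */
static const char *const kSanOptVars[] = {
    "ASAN_OPTIONS", "MSAN_OPTIONS", "UBSAN_OPTIONS", "LSAN_OPTIONS"};

/* Hypothetical helper: give every sanitizer a default that aborts on error
 * and disables in-process symbolization. Variables the user has already set
 * are kept as they are (overwrite = 0).
 * Returns 0 on success, -1 if any setenv call fails. */
int set_default_sanitizer_options(void) {
  size_t n = sizeof(kSanOptVars) / sizeof(kSanOptVars[0]);
  for (size_t i = 0; i < n; i++) {
    if (setenv(kSanOptVars[i], "abort_on_error=1:symbolize=0", 0) != 0)
      return -1;
  }
  return 0;
}
```

A fuzzer would call such a helper once before spawning the target, so that the child process inherits the options.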

However, we may sometimes still want the symbol names from the sanitizer report while fuzzing, for example to use backtrace information as feedback. One possible way to do this is to symbolize the addresses and offsets outside the sanitizer runtime linked into the fuzz target, i.e., to run a symbolizer in the fuzzer itself when necessary.
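To symbolize outside the target, the fuzzer first needs the module path and offset of each frame. The sketch below parses one unsymbolized frame line. It assumes the usual shape printed with symbolize=0, such as `#1 0x4f1a2b  (/tmp/target+0xf1a2b)`. The struct and function names are hypothetical, and module paths containing `)` are not handled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  unsigned frame_no;         /* the N in "#N" */
  unsigned long long pc;     /* absolute address printed by the sanitizer */
  char module[256];          /* path of the binary or shared object */
  unsigned long long offset; /* offset inside that module */
} raw_frame_t;

/* Parses a line such as "    #1 0x4f1a2b  (/tmp/target+0xf1a2b)".
 * Returns 1 on success, 0 if the line is not an unsymbolized frame. */
int parse_raw_frame(const char *line, raw_frame_t *out) {
  const char *hash = strchr(line, '#');
  int used = 0;
  if (!hash) return 0;
  if (sscanf(hash, "#%u %llx %n", &out->frame_no, &out->pc, &used) != 2 ||
      used <= 0)
    return 0;

  const char *open = strchr(hash + used, '(');
  if (!open) return 0;
  const char *close = strchr(open, ')');
  if (!close) return 0;

  /* The offset follows the last '+' inside the parentheses. */
  const char *plus = NULL;
  for (const char *p = open + 1; p < close; p++)
    if (*p == '+') plus = p;
  if (!plus) return 0;

  size_t len = (size_t)(plus - (open + 1));
  if (len == 0 || len >= sizeof(out->module)) return 0;
  memcpy(out->module, open + 1, len);
  out->module[len] = '\0';

  char *end = NULL;
  out->offset = strtoull(plus + 1, &end, 16);
  /* Require at least one digit and nothing between the number and ')'. */
  return end != plus + 1 && end == close;
}
```

Lines that are already symbolized (for example `#0 0x4f1a2b in main /src/a.c:3:5`) are rejected, so the same loop can run over a mixed report.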

SanitizerSymbolizerTool helps implement this. We extract __sanitizer::SymbolizerTool and its dependencies from compiler-rt and wrap them as a standalone library. With it, the fuzzer can drive external "tools" that perform symbolization by statically analysing the target binary (currently only llvm-symbolizer and addr2line are supported), in a style similar to the one implemented in the sanitizer runtime.
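For intuition about what "driving an external tool" involves: llvm-symbolizer reads requests from its standard input, one per line, where a line can carry a CODE or DATA keyword, a module path and an address. The sketch below only formats such a line. The function name is hypothetical, and it is an assumption that this library sends exactly this form internally.

```c
#include <stdio.h>

/* Hypothetical helper: writes a request line of the form
 *   KIND "module" 0xoffset\n
 * into buf, where kind is e.g. "CODE" or "DATA".
 * Returns the number of bytes written (without the trailing NUL),
 * or -1 on error or if the line does not fit into the buffer. */
int format_symbolizer_request(char *buf, size_t size, const char *kind,
                              const char *module, unsigned long long offset) {
  int n = snprintf(buf, size, "%s \"%s\" 0x%llx\n", kind, module, offset);
  if (n < 0 || (size_t)n >= size) return -1;
  return n;
}
```

The resulting line would be written to a pipe connected to a long-running llvm-symbolizer process, and the answer read back from its standard output. The library takes care of that plumbing, so user code normally never builds these lines itself.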

Windows is currently not supported. Fully migrating the cross-platform features of compiler-rt is left for the future. Fuchsia, however, will never be supported due to a lack of relevant docs.

One more thing, it's NOT THREAD SAFE.
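If several threads of a fuzzer need symbolization, one simple option is to funnel every call into the library through a single lock. The following is a minimal sketch using POSIX threads. All names in it are hypothetical, and `bump_counter` merely stands in for a function that would call the real symbolizer API.

```c
#include <pthread.h>

/* Hypothetical callback type: the body would call into the library. */
typedef int (*symbolize_fn)(void *ctx);

/* One process-wide lock guarding every use of the symbolizer. */
static pthread_mutex_t g_symbolizer_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs fn(ctx) while holding the lock and returns its result. */
int with_symbolizer_lock(symbolize_fn fn, void *ctx) {
  pthread_mutex_lock(&g_symbolizer_lock);
  int rc = fn(ctx);
  pthread_mutex_unlock(&g_symbolizer_lock);
  return rc;
}

/* Stand-in for real symbolizer work: increments an int counter. */
int bump_counter(void *ctx) {
  int *counter = (int *)ctx;
  return ++*counter;
}
```

The lock only helps if every call into the library goes through `with_symbolizer_lock`. Confining all symbolization to one dedicated thread is an alternative with the same effect.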

SanitizerSymbolizerTool is under the Apache License v2.0 with LLVM Exceptions (same as llvm/llvm-project). See LICENSE for more details.

Building

Dependencies

  • cmake (version 3.13.4 or newer)
  • clang & llvm (at least 12.0.0)

The source code itself doesn't depend on any special libraries or features from llvm/llvm-project. But to preserve the LLVM building style, the build system of compiler-rt is migrated and reused. I tried the 16.0.3 release first, but its shared cmake modules are located not only in llvm-project/compiler-rt/cmake but also in llvm-project/cmake, which makes things more complicated. Then I happened to find a copy of the 12.0.0 release on my laptop, whose cmake files in compiler-rt are truly standalone. So I used that one, and that's why the source code is based on 16.0.3 while the build system is based on 12.0.0.

Start to build

Run:

  • cd build
  • cmake .. -DLLVM_CONFIG_PATH=/path/to/llvm-config
  • make

The library can then be installed on your system with the make install command.

Some common options when calling cmake (for more information see Building LLVM with CMake):

  • -DCMAKE_INSTALL_PREFIX=directory --- Specify for directory the full path name of where you want SanitizerSymbolizerTool library to be installed (default /usr/local).

  • -DCMAKE_BUILD_TYPE=type --- Valid options for type are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is Debug.

Usage

Quick start

Include sanitizer_symbolizer_tool.h in your project, call its APIs, and link against the library built above. If you use the static library, your final executable must also link against the C++ standard library; otherwise you will get undefined references.

For llvm-symbolizer, you need a version that is not too old: according to some tests, at least the one from LLVM 12.0.1. Otherwise SanSymTool_init will fail because some command line options are not supported.
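A fuzzer can check this requirement up front instead of waiting for SanSymTool_init to fail. The sketch below takes the text printed by `llvm-symbolizer --version` and compares it against 12.0.1. It assumes the output contains a line of the form `LLVM version X.Y.Z`. The function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Looks for "LLVM version X.Y.Z" in version_text.
 * Returns 1 if X.Y.Z >= 12.0.1, 0 if it is older,
 * and -1 if no version could be found. */
int llvm_symbolizer_version_ok(const char *version_text) {
  const char *tag = "LLVM version ";
  const char *p = strstr(version_text, tag);
  unsigned major, minor, patch;
  if (!p ||
      sscanf(p + strlen(tag), "%u.%u.%u", &major, &minor, &patch) != 3)
    return -1;
  if (major != 12) return major > 12;
  if (minor != 0) return 1; /* 12.x with x > 0 is newer than 12.0.1 */
  return patch >= 1;
}
```

The version text itself could be captured by running the tool through `popen` and reading its output into a buffer before calling this function.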

For addr2line, versions from GNU Binutils 2.30 or newer are suggested. Older versions have not been tested.

Learn more

There is some interesting stuff in ./demo that can help you explore and learn more about this project.

  • bug-san0-dbg0-64

    bug-san0-dbg0-64.bin is one of the ELF binaries built from microBug. bug-san0-dbg0-64-visualize.html and bug-san0-dbg0-64-mem-map.svg show the structure of this ELF file. The SVG file is generated by drawio-desktop from bug-san0-dbg0-64-mem-map.drawio.

  • big-symbol

    big-symbol.cpp challenges a symbolizer with some inline functions and very long function names. The four binaries are built by

    clang++-12 -fPIC -pie -O0 -g -o big-symbol-elf-dbg1-pie1.bin  ./big-symbol.cpp;
    clang++-12 -fPIC -pie -O0    -o big-symbol-elf-dbg0-pie1.bin  ./big-symbol.cpp;
    clang++-12            -O0 -g -o big-symbol-elf-dbg1-pie0.bin  ./big-symbol.cpp;
    clang++-12            -O0    -o big-symbol-elf-dbg0-pie0.bin  ./big-symbol.cpp;

    on Ubuntu 18.04.6 LTS (x86_64).

  • checksum.txt

    Gives the MD5 checksums of the five *.bin files so that unexpected data corruption during distribution over the internet can be detected.

  • DispMemOffset.sh

    Uses GDB to check the relocation when running an ELF binary built as a PIE (Position-Independent Executable).

  • simple_demo

    simple_demo.c is an example of using SanitizerSymbolizerTool to check each address in .text, .data and .bss of bug-san0-dbg0-64.bin. You can modify it to check the other four binaries. These macros may help you:

    • for big-symbol-elf-dbg*-pie1.bin
      #define SEC_HEAD_TEXT 0x0007a0U
      #define SEC_TAIL_TEXT 0x003072U
      #define SEC_HEAD_DATA 0x206048U
      #define SEC_TAIL_DATA 0x206060U
      #define SEC_HEAD_BSS  0x206060U
      #define SEC_TAIL_BSS  0x208780U
    • for big-symbol-elf-dbg*-pie0.bin
      #define SEC_HEAD_TEXT 0x400680U
      #define SEC_TAIL_TEXT 0x402f42U
      #define SEC_HEAD_DATA 0x606050U
      #define SEC_TAIL_DATA 0x606060U
      #define SEC_HEAD_BSS  0x606060U
      #define SEC_TAIL_BSS  0x608780U

    simple_demo_dummy_build.sh roughly builds simple_demo from all available source code instead of linking against the pre-built library; it is meant for a quick debug check.
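The idea of checking each address of a section can be sketched as a walk over a HEAD/TAIL pair. This is not the code of simple_demo.c. The names `addr_visitor` and `walk_section` are hypothetical, and treating TAIL as exclusive is an assumption (the listed values are consistent with it, since SEC_TAIL_DATA equals SEC_HEAD_BSS). The macros are the values given above for big-symbol-elf-dbg*-pie0.bin.

```c
#include <stddef.h>

/* Section bounds for big-symbol-elf-dbg*-pie0.bin, as listed above. */
#define SEC_HEAD_TEXT 0x400680U
#define SEC_TAIL_TEXT 0x402f42U
#define SEC_HEAD_DATA 0x606050U
#define SEC_TAIL_DATA 0x606060U

/* Hypothetical callback: a real one would hand addr to the symbolizer. */
typedef void (*addr_visitor)(unsigned long addr, void *ctx);

/* Visits every address in the half-open range [head, tail) and
 * returns how many addresses were visited. visit may be NULL. */
size_t walk_section(unsigned long head, unsigned long tail,
                    addr_visitor visit, void *ctx) {
  size_t count = 0;
  for (unsigned long addr = head; addr < tail; addr++) {
    if (visit) visit(addr, ctx);
    count++;
  }
  return count;
}
```

For the PIE builds the macros are small offsets rather than absolute addresses, so they would have to be combined with the load base (which DispMemOffset.sh helps to inspect) before being compared with runtime addresses.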