==================
american fuzzy lop
==================

  Written and maintained by Michal Zalewski <lcamtuf@google.com>

  Copyright 2013, 2014, 2015 Google Inc. All rights reserved.
  Released under terms and conditions of Apache License, Version 2.0.

  For new versions and additional information, check out:
  http://lcamtuf.coredump.cx/afl/

  To compare notes with other users or get notified about major new features,
  send a mail to <afl-users+subscribe@googlegroups.com>.

1) Challenges of guided fuzzing
-------------------------------

Fuzzing is one of the most powerful and proven strategies for identifying
security issues in real-world software; it is responsible for the vast
majority of remote code execution and privilege escalation bugs found to date
in security-critical software.

Unfortunately, fuzzing also offers fairly shallow coverage, because many of the
mutations needed to reach new code paths are exceedingly unlikely to be hit
purely by chance.

There have been numerous attempts to solve this problem by augmenting the
process with additional information about the behavior of the tested code,
ranging from simple corpus distillation, to flow analysis (aka "concolic"
execution), to pure symbolic execution, to static analysis.

The first method on that list has been demonstrated to work well, but depends
on the availability of a massive, high-quality corpus of valid input data. On
top of this, coverage measurements provide only a fairly simplistic view of
program state, making them less suited for guiding the fuzzing process later
on.

The remaining techniques are extremely promising in experimental settings, but
frequently suffer from reliability problems or irreducible complexity. Most of
the high-value targets have enough internal states and possible execution paths
to make such tools fall apart and perform strictly worse than their traditional
counterparts, at least until fine-tuned with utmost care.

2) The afl-fuzz approach
------------------------

American Fuzzy Lop is a brute-force fuzzer coupled with an exceedingly simple
but rock-solid instrumentation-guided genetic algorithm. It uses an enhanced
form of edge coverage to easily detect subtle, local-scale changes to program
control flow, without being bogged down by complex comparisons between multiple
long-winded execution paths.

Simplifying a bit, the overall algorithm can be summed up as:

  1) Load user-supplied initial test cases into the queue,

  2) Take next input file from the queue,

  3) Attempt to trim the test case to the smallest size that doesn't alter
     the measured behavior of the program,

  4) Repeatedly mutate the file using a balanced and well-researched variety
     of traditional fuzzing strategies,

  5) If any of the generated mutations resulted in a new state transition
     recorded by the instrumentation, add mutated output as a new entry in the
     queue.

  6) Go to 2.

The discovered test cases are also periodically culled to eliminate ones that
have been obsoleted by newer, higher-coverage finds, and undergo several other
instrumentation-driven effort minimization steps.

The strategies mentioned in step 4 are fairly straightforward, but go well
beyond the functionality of tools such as zzuf and honggfuzz and lead to
additional finds; this is discussed in more detail in technical_notes.txt.

As a side result of the fuzzing process, the tool creates a small,
self-contained corpus of interesting test cases. These are extremely useful
for seeding other, labor- or resource-intensive testing regimes - for example,
for stress-testing browsers, office applications, graphics suites, or
closed-source tools.

The fuzzer is thoroughly tested to deliver coverage far superior to blind
fuzzing or coverage-only tools without the need to dial in any settings or
adjust any knobs.
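
To make the loop described in this section more concrete, here is a minimal
Python sketch. It is an illustration only, not AFL's actual code: toy_target()
is a hypothetical stand-in for an instrumented binary, the single
byte-overwrite strategy in mutations() replaces the real mix of fuzzing
strategies, and trimming (step 3) and queue culling are left out. Unlike
afl-fuzz, which keeps cycling until interrupted, the sketch stops once every
queue entry has been processed.

```python
def toy_target(data):
    """Hypothetical stand-in for an instrumented binary.

    Returns the set of (previous_block, current_block) transitions taken
    while checking the input against the magic value b"FUZZ".
    """
    edges = set()
    prev = "start"
    for i, expected in enumerate(b"FUZZ"):
        if i >= len(data) or data[i] != expected:
            break
        cur = "match%d" % i
        edges.add((prev, cur))
        prev = cur
    edges.add((prev, "exit"))
    return edges


def mutations(data):
    """Step 4, heavily simplified: overwrite each byte with every other value."""
    for pos in range(len(data)):
        for value in range(256):
            if value != data[pos]:
                yield data[:pos] + bytes([value]) + data[pos + 1:]


def fuzz(seeds, run):
    queue = list(seeds)          # step 1: load the initial test cases
    seen = set()
    for entry in queue:
        seen |= run(entry)       # transitions already covered by the seeds
    idx = 0
    while idx < len(queue):      # step 2: take the next input from the queue
        for candidate in mutations(queue[idx]):
            new = run(candidate) - seen
            if new:              # step 5: a new state transition was recorded
                seen |= new
                queue.append(candidate)
        idx += 1                 # step 6: go to 2 (stops when nothing is left)
    return queue


if __name__ == "__main__":
    # prints [b'AAAA', b'FAAA', b'FUAA', b'FUZA', b'FUZZ']
    print(fuzz([b"AAAA"], toy_target))
```

Starting from the single seed AAAA, the sketch reaches FUZZ one byte at a time:
each partial match triggers a transition that was not seen before, so the
corresponding input is kept in the queue and mutated further.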

3) Instrumenting programs for use with AFL
------------------------------------------

When source code is available, instrumentation can be injected by a companion
tool that works as a drop-in replacement for gcc or clang in any standard build
process for third-party code.

The instrumentation has a fairly modest performance impact; in conjunction with
other optimizations implemented by afl-fuzz, most programs can be fuzzed as
fast or even faster than possible with traditional tools.

The correct way to recompile the target program may vary depending on the
specifics of the build process, but a nearly-universal approach would be:

$ CC=/path/to/afl/afl-gcc ./configure
$ make clean all

For C++ programs, you will want:

$ CXX=/path/to/afl/afl-g++ ./configure

The clang wrappers (afl-clang and afl-clang++) are used in the same way; clang
users can also leverage a higher-performance instrumentation mode described in
llvm_mode/README.llvm.

When testing libraries, it is essential to either link the tested executable
against a static version of the instrumented library, or to set the right
LD_LIBRARY_PATH. Usually, the simplest option is just:

$ CC=/path/to/afl/afl-gcc ./configure --disable-shared

Setting AFL_HARDEN=1 when calling 'make' will cause the CC wrapper to
automatically enable code hardening options that make it easier to detect
simple memory bugs. The cost of this is a <5% performance drop.

Oh: when using ASAN, see the notes_for_asan.txt file for important caveats.

4) Instrumenting binary-only apps
---------------------------------

When fuzzing closed-source programs that can't be easily recompiled with
afl-gcc, the fuzzer offers experimental support for fast, on-the-fly
instrumentation of black-box binaries. This is accomplished with a version of
QEMU running in the lesser-known "user space emulation" mode.

QEMU is a project separate from AFL, but you can conveniently build the
feature by doing:

$ cd qemu_mode
$ ./build_qemu_support.sh

For additional instructions and caveats, see qemu_mode/README.qemu.

The mode isn't free; compared to compile-time instrumentation, the fuzzing
process will be approximately 2-5x slower; it is also less conducive to
parallelization on multiple cores.

5) Choosing initial test cases
------------------------------

To operate correctly, the fuzzer requires one or more starting files containing
the typical input normally expected by the targeted application. There are two
basic rules:

  - Keep the files small. Under 1 kB is ideal, although not strictly necessary.
    For a discussion of why size *really* matters, see perf_tips.txt.

  - Use multiple test cases only if they are fundamentally different from
    each other. There is no point in using fifty different vacation photos
    to fuzz an image library.

You can find quite a few good examples of starting files in the testcases/
subdirectory that comes with this tool.

If a large corpus of data is available for screening, you may want to use the
afl-cmin utility to reject redundant files - ideally, with an aggressive
timeout (-t); afl-showmap can be used to manually examine and compare execution
traces, too.

6) Fuzzing binaries
-------------------

The fuzzing process itself is carried out by the afl-fuzz utility. The program
requires a read-only directory with initial test cases, a separate place to
store its findings, plus a path to the binary to test.

For programs that accept input directly from stdin, the usual syntax may be:

$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program [...params...]

For programs that take input from a file, use '@@' to mark the location where
the input file name should go. The fuzzer will substitute this for you:

$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program -r @@

You can also use the -f option to have the mutated data written to a specific
file.
This is useful if the program expects a particular file extension or so.

Non-instrumented binaries can be fuzzed in the QEMU mode by adding -Q in the
command line. It is also possible to use the -n flag to run afl-fuzz in plain
old non-guided mode. This gives you a fairly traditional fuzzer with a couple
of nice testing strategies.

You can use -t and -m to override the default timeout and memory limit for the
executed process; this is seldom necessary, perhaps except for video decoders
or compilers.

Tips for optimizing the performance of the process are discussed in
perf_tips.txt.

Note that the fuzzer starts by meticulously performing an array of
deterministic fuzzing steps, which can take several days. If you want more
traditional behavior akin to zzuf or honggfuzz, use the -d option to get quick
but less systematic and less in-depth results right away.

7) Interpreting output
----------------------

The fuzzing process will continue until you press Ctrl-C. See the
status_screen.txt file for information on how to interpret the displayed stats
and monitor the health of the process. At the *very* minimum, you want to allow
the fuzzer to complete one queue cycle, which may take anywhere from a couple
of hours to a week or so.

There are three subdirectories created within the output directory and updated
in real time:

  - queue/   - test cases for every distinctive execution path, plus all the
               starting files given by the user. This is, in effect, the
               synthesized corpus mentioned in section 2.

               If desired, you can use afl-cmin to shrink the corpus to a much
               smaller size. This works by throwing away earlier inputs that
               used to trigger unique behaviors in the past, but have been
               made obsolete by better finds made by afl-fuzz later on.

  - hangs/   - unique test cases that cause the tested program to time out.
               Note that the default timeouts are fairly aggressive (set at
               5x the average execution time) to keep things moving fast.

  - crashes/ - unique test cases that cause the tested program to receive a
               fatal signal (e.g., SIGSEGV, SIGILL, SIGABRT). The entries are
               grouped by the received signal.

Crashes and hangs are considered "unique" if the associated execution paths
involve any state transitions not seen in previously-recorded faults. If a
single bug can be reached in multiple ways, there will be some count inflation
early in the process, but this should quickly taper off.

The file names for crashes and hangs should let you correlate them with the
parent, non-faulting queue entries. This should help with debugging.

When you can't reproduce a crash found by afl-fuzz, the most likely cause is
that you are not setting the same memory limit as used by the tool. Try:

$ LIMIT_MB=50
$ ( ulimit -Sv $((LIMIT_MB << 10)); /path/to/tested_binary ... )

Change LIMIT_MB to match the -m parameter passed to afl-fuzz. On OpenBSD,
also change -Sv to -Sd.

Any existing output directory can also be used to resume aborted jobs; try:

$ ./afl-fuzz -i- -o existing_output_dir [...etc...]

If you have gnuplot installed, you can also generate some pretty graphs for any
active fuzzing task using 'afl-plot'. For an example of what this looks like,
see http://lcamtuf.coredump.cx/afl/plot/.

8) Parallelized fuzzing
-----------------------

Every instance of afl-fuzz takes up roughly one core. This means that on
multi-core systems, parallelization is necessary to fully utilize the hardware.
For tips on how to fuzz a common target on multiple cores or multiple networked
machines, please refer to parallel_fuzzing.txt.

9) Fuzzer dictionaries
----------------------

By default, the afl-fuzz mutation engine is optimized for compact data
formats - say, images, multimedia, compressed data, regular expression syntax,
or shell scripts. It is somewhat less suited for languages with particularly
verbose and redundant verbiage - notably including HTML, SQL, or JavaScript.

To avoid the hassle of building syntax-aware tools, afl-fuzz provides a way to
seed the fuzzing process with an optional dictionary of language keywords,
magic headers, or other special tokens associated with the targeted data type
- and use that to reconstruct the underlying grammar on the go:

  http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html

To use this feature, you first need to create a dictionary in one of the two
formats discussed in testcases/README.testcases; and then point the fuzzer to
it via the -x option in the command line.

There is no way to provide more structured descriptions of the underlying
syntax, but the fuzzer will likely figure out some of this based on the
instrumentation feedback alone. This actually works in practice, say:

  http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html

PS. Even when no explicit dictionary is given, afl-fuzz will try to extract
existing syntax tokens in the input corpus by watching the instrumentation
very closely during deterministic byte flips. This works for some types of
parsers and grammars, but isn't nearly as good as the -x mode.

10) Crash triage
----------------

The coverage-based grouping of crashes usually produces a small data set that
can be quickly triaged manually or with a very simple GDB or Valgrind script.
Every crash is also traceable to its parent non-crashing test case in the
queue, making it easier to diagnose faults.

Having said that, it's important to acknowledge that some fuzzing crashes can
be difficult to quickly evaluate for exploitability without a lot of debugging
and code analysis work. To assist with this task, afl-fuzz supports a unique
"crash exploration" mode enabled with the -C flag.

In this mode, the fuzzer takes one or more crashing test cases as the input,
and uses its feedback-driven fuzzing strategies to very quickly enumerate all
code paths that can be reached in the program while keeping it in the
crashing state.
Mutations that do not result in a crash are rejected; so are any changes that
do not affect the execution path.

The output is a small corpus of files that can be very rapidly examined to see
what degree of control the attacker has over the faulting address, or whether
it is possible to get past an initial out-of-bounds read - and see what lies
beneath.

Oh, one more thing: for test case minimization, give afl-tmin a try. The tool
can be operated in a very simple way:

$ ./afl-tmin -i test_case -o minimized_result -- /path/to/program [...]

The tool works with crashing and non-crashing test cases alike. In the crash
mode, it will happily accept instrumented and non-instrumented binaries. In the
non-crashing mode, the minimizer relies on standard AFL instrumentation to make
the file simpler without altering the execution path.

The minimizer accepts the -m, -t, -f and @@ syntax in a manner compatible with
afl-fuzz.

11) Common-sense risks
----------------------

Please keep in mind that, similarly to many other computationally-intensive
tasks, fuzzing may put strain on your hardware and on the OS. In particular:

  - Your CPU will run hot and will need adequate cooling. In most cases, if
    cooling is insufficient or stops working properly, CPU speeds will be
    automatically throttled. That said, especially when fuzzing on less
    suitable hardware (laptops, smartphones, etc), it's not entirely impossible
    for something to blow up.

  - Targeted programs may end up erratically grabbing gigabytes of memory or
    filling up disk space with junk files. AFL tries to enforce basic memory
    limits, but can't prevent each and every possible mishap. The bottom line
    is that you shouldn't be fuzzing on systems where the prospect of data loss
    is not an acceptable risk.

  - Fuzzing involves billions of reads and writes to the filesystem. On modern
    systems, this will usually be heavily cached, resulting in fairly modest
    "physical" I/O - but there are many factors that may alter this equation.
    It is your responsibility to monitor for potential trouble; with very heavy
    I/O, the lifespan of many HDDs and SSDs may be reduced.

    A good way to monitor disk I/O on Linux is the 'iostat' command:

    $ iostat -d 3 -x -k [...optional disk ID...]

12) Known limitations & areas for improvement
---------------------------------------------

Here are some of the most important caveats for AFL:

  - AFL detects faults by checking for the first spawned process dying due to
    a signal (SIGSEGV, SIGABRT, etc). Programs that install custom handlers for
    these signals may need to have the relevant code commented out. In the same
    vein, faults in child processes spawned by the fuzzed target may evade
    detection unless you manually add some code to catch that.

  - As with any other brute-force tool, the fuzzer offers limited coverage if
    encryption, checksums, cryptographic signatures, or compression are used to
    wholly wrap the actual data format to be tested.

    To work around this, you can comment out the relevant checks (see
    experimental/libpng_no_checksum/ for inspiration); if this is not possible,
    you can also write a postprocessor, as explained in
    experimental/post_library/.

  - There are some unfortunate trade-offs with ASAN and 64-bit binaries. This
    isn't due to any specific fault of afl-fuzz; see notes_for_asan.txt for
    tips.

  - There is no direct support for fuzzing network services, background
    daemons, or interactive apps that require UI interaction to work. You may
    need to make simple code changes to make them behave in a more traditional
    way. Preeny may offer a relatively simple option, too - see:
    https://github.com/zardus/preeny

  - AFL doesn't output human-readable coverage data. If you want to monitor
    coverage, use afl-cov from Michael Rash: https://github.com/mrash/afl-cov

Beyond this, see INSTALL for platform-specific tips.
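
The first caveat in the list above (faults are detected by watching the
directly spawned process die from a signal) can be illustrated with a short
Python sketch. This is not how afl-fuzz itself is implemented; fatal_signal()
is a hypothetical helper built on the standard subprocess module, and it
assumes a POSIX system, where a negative return code means the child was
killed by a signal.

```python
import signal
import subprocess
import sys


def fatal_signal(argv, data=b""):
    """Run argv with data on stdin; return the fatal signal name, or None."""
    proc = subprocess.run(
        argv,
        input=data,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, a return code of -N means the child was killed by signal N.
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None


if __name__ == "__main__":
    # Two stand-in "targets": one aborts, one exits cleanly.
    crashing = [sys.executable, "-c", "import os; os.abort()"]
    clean = [sys.executable, "-c", "pass"]
    print(fatal_signal(crashing))
    print(fatal_signal(clean))
```

Only the process started directly by the helper is observed. A target that
catches the signal and exits normally, or one that crashes in a grandchild
process, would be reported as clean, which mirrors the limitation described
above.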

13) Special thanks
------------------

Many of the improvements to afl-fuzz wouldn't be possible without feedback,
bug reports, or patches from:

  Jann Horn                             Hanno Boeck
  Felix Groebert                        Jakub Wilk
  Richard W. M. Jones                   Alexander Cherepanov
  Tom Ritter                            Hovik Manucharyan
  Sebastian Roschke                     Eberhard Mattes
  Padraig Brady                         Ben Laurie
  @dronesec                             Luca Barbato
  Tobias Ospelt                         Thomas Jarosch
  Martin Carpenter                      Mudge Zatko
  Joe Zbiciak                           Ryan Govostes
  Michael Rash                          William Robinet
  Jonathan Gray                         Filipe Cabecinhas
  Nico Weber                            Jodie Cunningham
  Andrew Griffiths                      Parker Thompson
  Jonathan Neuschfer                    Tyler Nighswander
  Ben Nagy                              Samir Aguiar
  Aidan Thornton                        Aleksandar Nikolich
  Sam Hakim                             Laszlo Szekeres
  David A. Wheeler                      Turo Lamminen
  Andreas Stieger                       Richard Godbee
  Louis Dassy

Thank you!

14) Contact
-----------

Questions? Concerns? Bug reports? The author can usually be reached at
<lcamtuf@google.com>.

There is also a mailing list for the project; to join, send a mail to
<afl-users+subscribe@googlegroups.com>. Or, if you prefer to browse
archives first, try:

  https://groups.google.com/group/afl-users

PS. If you wish to submit raw code to be incorporated into the project, please
be aware that the copyright on most of AFL is claimed by Google. While you do
retain copyright on your contributions, they do ask people to agree to a simple
CLA first:

  https://cla.developers.google.com/clas

Sorry about the hassle. Of course, no CLA is required for feature requests or
bug reports.