Primary language: C. License: Apache License 2.0.

VUzzer

About

This project depends heavily on a modified version of DataTracker, which in turn depends on the libdft Pin tool. The modified version adds some extra tags to libdft. The original DataTracker repository is at https://github.com/m000/dtracker.

Running the VUzzer:

Please see wikiHOWTO.md for a step-by-step procedure to run VUzzer. That file also explains most of the options.

Requirements

DataTracker runs on 32-bit Linux systems. This limitation is imposed by the current version of libdft. However, the methods used by both tools are not platform-specific, so in principle they can be ported to any platform supported by Intel Pin. The requirements for running DataTracker are:

  • A C++11 compiler and unix build utilities (e.g. GNU Make).
  • A recent (>=2.13) version of Intel Pin. The framework must be present in a directory named pin inside the VUzzer top directory. A simple way to arrange this is to create a symbolic link pointing to your Pin directory:
$ cd vuzzer
$ ln -s /path-to-pin-home pin
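
As a quick sanity check of the layout described above, the following Python sketch verifies that a pin entry exists in the VUzzer top directory and that it contains the pin.sh launcher used by the commands later in this README. The function name check_pin_layout is hypothetical and not part of VUzzer.

```python
import os

def check_pin_layout(vuzzer_dir):
    """Return a list of problems with the pin directory layout (empty if none)."""
    problems = []
    pin_dir = os.path.join(vuzzer_dir, "pin")
    # os.path.isdir follows symbolic links, so a link to the Pin home is accepted.
    if not os.path.isdir(pin_dir):
        problems.append("missing directory or symlink: " + pin_dir)
    elif not os.path.isfile(os.path.join(pin_dir, "pin.sh")):
        problems.append("pin.sh not found in " + pin_dir)
    return problems
```

An empty result means the directory layout matches what the build and run commands expect; it does not check the Pin version.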

We have tested VUzzer by running it in VirtualBox with Ubuntu 14.04 LTS (32-bit) and a Linux 3.16.0.32 image. Note that with kernel 4.x.y, Pin (2.13) panics. We recommend setting up the same environment to use VUzzer. This limitation will be addressed in a future release of VUzzer with 64-bit support.

Installation

First change into the vuzzer directory (cd vuzzer), then run:

export PIN_ROOT=$(pwd)/pin

If libdft has already been built, go to support/libdft/src and run make clean. Then, back in the parent folder, execute the following:

make support-libdft
make
make -f mymakefile

If all the steps above were successful, obj-ia32/dtracker.so and obj-ia32/bbcounts2.so will be created. These are the Pin tools containing all the instrumentation required to perform taint-flow and basic-block-level tracing.
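
The build sequence above can also be scripted. The sketch below is one way to do it in Python, assuming the make commands are meant to run from the VUzzer top directory and that make is on the PATH. The helper names build_plan and run_build are hypothetical. It points PIN_ROOT at the pin directory inside the top directory and runs the three make invocations in order, stopping at the first failure.

```python
import os
import subprocess

def build_plan(vuzzer_dir):
    """Return (env, commands) for the build steps listed in this README."""
    top = os.path.abspath(vuzzer_dir)
    env = dict(os.environ)
    # PIN_ROOT points at the pin directory inside the top directory.
    env["PIN_ROOT"] = os.path.join(top, "pin")
    commands = [
        ["make", "support-libdft"],
        ["make"],
        ["make", "-f", "mymakefile"],
    ]
    return env, commands

def run_build(vuzzer_dir):
    """Run each build step in the top directory; raises CalledProcessError on failure."""
    env, commands = build_plan(vuzzer_dir)
    for cmd in commands:
        subprocess.run(cmd, cwd=os.path.abspath(vuzzer_dir), env=env, check=True)
```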

Changing the tags

Currently there are four custom tags:

  • libdft_tag_set_fdoff
  • libdft_tag_bitset
  • libdft_tag_ewah
  • libdft_tag_bvector

The default tag is libdft_tag_ewah. To change the tag, you need to change the following two files:

  • Makefile.rules in the root directory

    • Change LIBDFT_TAG_FLAGS accordingly from line #12 to line #15
  • makefile.libdft present in support/libdft directory.

    • Change LIBDFT_TAG_FLAGS accordingly from line #3 to line #6

*** Note: Use the same LIBDFT_TAG_FLAGS in both makefiles, and make sure you run make clean for libdft before building libdft again. ***
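
Because the two files must agree, a small consistency check can help catch a mismatch before rebuilding. The Python sketch below is illustrative only. It assumes that each file selects its tag through exactly one uncommented make assignment to LIBDFT_TAG_FLAGS and that the alternatives are commented out with #; the actual layout of the makefiles may differ. The function names are hypothetical.

```python
import re

# Matches an uncommented make assignment to LIBDFT_TAG_FLAGS (=, ?=, := or +=).
_ASSIGN = re.compile(r"^\s*LIBDFT_TAG_FLAGS\s*[?:+]?=\s*(.*?)\s*$")

def active_tag_flags(makefile_text):
    """Return the values of all uncommented LIBDFT_TAG_FLAGS assignments."""
    values = []
    for line in makefile_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # assumed to be a commented-out alternative
        match = _ASSIGN.match(line)
        if match:
            values.append(match.group(1))
    return values

def tags_consistent(rules_text, libdft_text):
    """True if both files have the same, single active LIBDFT_TAG_FLAGS value."""
    a = active_tag_flags(rules_text)
    b = active_tag_flags(libdft_text)
    return len(a) == 1 and a == b
```

The two arguments would be the contents of Makefile.rules and support/libdft/makefile.libdft, read as text.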

Running

Capturing raw provenance

To capture provenance from a program, launch it from the unix shell using something like this:

./pin/pin.sh -follow_execv -t ./obj-ia32/dtracker.so -filename <name_of_file> -- <program> <args>

Compulsory Knob:

  • -filename <name_of_file>

*** Note: Please ensure that you supply the name of the file whose taint information you want to track. Otherwise there will be no taint propagation. ***

The command runs the program under Pin. In addition to the standard Pin knobs, DataTracker supports these tool-specific knobs:

  • -stdin [1|0]: Turns tracking of data read from the standard input on or off. Default is off.
  • -stdout [1|0]: Turns logging of provenance of data written to standard output on or off. Default is on.
  • -stderr [1|0]: Turns logging of provenance of data written to standard error on or off. Default is off.
  • -maxoff integer_val: Sets the limit on the size of the taint offsets of a cmp instruction. Default is 4.
  • -maxlea integer_val: Sets the limit on the size of the taint offsets of a lea instruction. Default is 4.

Note that launching large programs using the method above takes a lot of time. For such programs, it is suggested to first launch the program and then attach DataTracker to the running process like this:

./pin/pin.sh -follow_execv -pid <pid> -t ./obj-ia32/dtracker.so <knobs>
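
The two invocation forms above (launching a program and attaching to a running process) can be assembled programmatically, which helps when the same knobs are reused across many runs. The following Python sketch only builds argument lists from the paths and knobs named in this section. The function names are hypothetical, and it assumes the attach form accepts the same knobs as the launch form.

```python
def dtracker_knobs(filename, stdin=None, stdout=None, stderr=None,
                   maxoff=None, maxlea=None):
    """Build the tool-specific knob arguments; None leaves a knob at its default."""
    args = ["-filename", filename]  # compulsory knob
    for name, value in (("-stdin", stdin), ("-stdout", stdout), ("-stderr", stderr)):
        if value is not None:
            args += [name, "1" if value else "0"]
    for name, value in (("-maxoff", maxoff), ("-maxlea", maxlea)):
        if value is not None:
            args += [name, str(int(value))]
    return args

def launch_command(knobs, program, program_args=()):
    """argv for starting a program under Pin with DataTracker."""
    return (["./pin/pin.sh", "-follow_execv", "-t", "./obj-ia32/dtracker.so"]
            + list(knobs) + ["--", program] + list(program_args))

def attach_command(knobs, pid):
    """argv for attaching DataTracker to an already running process."""
    return (["./pin/pin.sh", "-follow_execv", "-pid", str(pid),
             "-t", "./obj-ia32/dtracker.so"] + list(knobs))
```

Either list can be passed to subprocess.run with the VUzzer top directory as the working directory, since the paths are relative to it.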

The raw provenance generated by DataTracker is written to the file rawprov.out. Any additional debugging information is written to the file pintool.log.

CMP Output Format (cmp.out)

cmp.out contains every compare instruction that has an operand tainted by some offset of the input file. Each instruction is represented by one row of 13 space-separated values, as follows:

Bit-operation cmp-type ins-address dest[0] dest[1] dest[2] dest[3] src[0] src[1] src[2] src[3] dest_val src_val

8 reg reg 0x08048532 {0} {} {} {} {2} {} {} {} Z a
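
To make the row layout concrete, here is a minimal Python sketch of a parser for one cmp.out row. It is based only on the header and the sample row above and rests on several assumptions: the cmp-type field may span more than one token (it is "reg reg" in the sample, so that row has 14 whitespace-separated tokens), the instruction address is the first token starting with 0x, a taint set with several offsets would be comma-separated inside the braces, and the two value fields contain no whitespace. The function and key names are hypothetical.

```python
def parse_taint_set(token):
    """Parse a token like "{}" or "{2}" into a set of file offsets."""
    if not (token.startswith("{") and token.endswith("}")):
        raise ValueError("not a taint set: " + token)
    inner = token[1:-1].strip()
    # Assumption: several offsets in one set would be comma-separated.
    return {int(x) for x in inner.split(",")} if inner else set()

def parse_cmp_line(line):
    """Parse one cmp.out row into a dict."""
    tokens = line.split()
    # Locate the instruction address; the tokens between the first field and
    # the address form the cmp-type (two words, "reg reg", in the sample row).
    addr_idx = next((i for i, t in enumerate(tokens) if t.startswith("0x")), -1)
    rest = tokens[addr_idx + 1:]
    if addr_idx < 2 or len(rest) != 10:
        raise ValueError("unexpected cmp.out row: " + line)
    return {
        "bits": int(tokens[0]),                 # the Bit-operation field
        "cmp_type": " ".join(tokens[1:addr_idx]),
        "address": int(tokens[addr_idx], 16),
        "dest": [parse_taint_set(t) for t in rest[0:4]],
        "src": [parse_taint_set(t) for t in rest[4:8]],
        "dest_val": rest[8],
        "src_val": rest[9],
    }
```

Applied to the sample row, this yields a destination operand whose first byte is tainted by file offset 0 and a source operand whose first byte is tainted by file offset 2, with the values Z and a kept as strings.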