/bootloader_instrumentation_suite

Bootloader research tools (very much a work in progress)

Primary language: Python. License: MIT.

OVERVIEW

tl;dr (since most of this documentation is outdated)
The examples run on x86_64 Linux once fiddle is installed. The arm and aarch64 examples assume you have arm-linux-gnueabihf-* installed in /usr, along with qemu-arm/qemu-aarch64. The uboot example assumes you have unpacked gcc linaro to ~/software/gcc-linaro-5.2-2015.11-x86_64_arm-linux-gnueabihf/ and built the special qemu.
To use the uboot example you need to download the AM37x technical reference from http://www.ti.com/lit/pdf/sprugn4 (TI's "AM/DM37x Multimedia Device Technical Reference Manual (Silicon Revision 1.x) (Rev. R)")
and copy it to $(fiddle install path)/hw_info/bbxm/am37x_technical_reference.pdf
You will also need to download the beagleboard base SD image from https://cs.dartmouth.edu/~bx/beagleboard-xm-orig.img
and copy it to $(fiddle install path)/hw_info/bbxm/beagleboard-xm-orig.img

To work with an example such as uboot/BBxM, follow these steps:

1. Enter target's directory
cd fiddle_examples/<example>
2. Tell fiddle to build example and construct static analysis databases (then go take a nap, it may take a while)
I_CONF=test.cfg fiddle -c
3. (optional) tell fiddle to import a policy to run analysis on (below is an example for test_arm)
I_CONF=test.cfg fiddle -I _single policies/example2/substages.yml policies/example2/regions.yml
4. tell fiddle to execute dynamic analysis (will also take some time)
I_CONF=test.cfg fiddle -R
5. (only for arm targets and if policy imported) tell it to postprocess writes (calculate write blocks)
I_CONF=test.cfg fiddle -p consolidate_writes
6. (only if #5 completed) tell it to check whether the last run had any policy violations
I_CONF=test.cfg fiddle -p policy_check

If you want to change the policy, edit it and restart from step 3.
If you change the test case, restart the process from step 2.
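
The steps above can be scripted. The following is a minimal sketch and not part of fiddle: the helper names fiddle_steps and run_steps are made up for illustration, and the flags, the "_single" argument, and the I_CONF=test.cfg setting are copied from the commands listed above. It assumes that a failed command should stop the sequence, which matches the note that step 6 only makes sense if step 5 completed.

```python
import os
import subprocess


def fiddle_steps(policy_files=None, arm_target=True):
    """Return the fiddle command lines for one full run, in order.

    Hypothetical helper: it only mirrors the numbered steps in this README.
    """
    steps = [["fiddle", "-c"]]  # step 2: build and run static analysis
    if policy_files:
        # step 3 (optional): import a policy; "_single" is copied from the example
        steps.append(["fiddle", "-I", "_single"] + list(policy_files))
    steps.append(["fiddle", "-R"])  # step 4: dynamic analysis
    if policy_files and arm_target:
        steps.append(["fiddle", "-p", "consolidate_writes"])  # step 5
        steps.append(["fiddle", "-p", "policy_check"])  # step 6
    return steps


def run_steps(example_dir, steps, conf="test.cfg", dry_run=True):
    """Print each command and, unless dry_run is set, run it in example_dir."""
    env = dict(os.environ, I_CONF=conf)
    for argv in steps:
        print("I_CONF=%s %s" % (conf, " ".join(argv)))
        if not dry_run:
            # check_call raises on a non-zero exit status, so later steps are skipped
            subprocess.check_call(argv, cwd=example_dir, env=env)
```

With dry_run left at True, run_steps only prints the commands, which is a safe way to check the sequence before starting a long build.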

NB: This README is super out-of-date and will eventually be updated

This test suite helps you keep track of different versions of
u-boot/build tools, static analysis of that build's binaries, and
runtime trace results of running that binary on a given hardware
configuration. For each u-boot/build configuration it keeps a database
of information it statically gathered for each boot stage, boot stage
images/ELF files, a prepared SD card image, and test results of
runtime trace analyses.  If it detects changes in the u-boot source or
build tools it will create a new set of test result directories with a
new sdcard image and static analysis results.

LARGE PICTURE/TOOLSUITE DESIGN

A significant portion of this tool suite's codebase was written to
support and manage multiple target (bootloader) versions/builds,
emulator versions, hardware debugging tool builds/versions, and the
dependencies between/among different dynamic analyses and analyses of
data collected from an instrumented execution of the target.  Hence
this instrumentation suite has become the over-engineered monster with
which you are now becoming acquainted.

CONFIGURATION



TEST RESULTS DIRECTORY STRUCTURE

It stores the results in a directory tree in the following manner:

/ -- each subdirectory of the test data root contains results for a different u-boot/compiler configuration
/<test_config_instance> -- contains the results of a single bootloader configuration
/<test_config_instance>/bootsuite.yaml -- contains a cached version of the configuration file (more on this file later)
/<test_config_instance>/u-boot-<stage>-analysis.h5 -- static analysis information for a given stage stored in a pytables database
/<test_config_instance>/images -- contains copies of the boot stage ELF and raw images for this configuration as well as the sdcard image
/<test_config_instance>/trace-data -- contains results from runtime analyses
/<test_config_instance>/trace-data/<date_time> -- a runtime trace analysis instance
/<test_config_instance>/trace-data/<date_time>/info.yaml -- information about this trace test instance
/<test_config_instance>/trace-data/<date_time>/<hardware_instance> -- results for a given hardware instance (e.g. QemuBreakpointTrace, QemuWatchpointTrace) and any cached configuration files, log files, etc. for this test session
/<test_config_instance>/trace-data/<date_time>/<hardware_instance>/<stage>-traces.h5 -- a database of dynamically gathered store operations ("trace database") for the given stage
/<test_config_instance>/trace-data/<date_time>/<hardware_instance>/<stage>-trace-histogram.txt -- A human-readable summary of trace information gathered during this run formatted as:

pc=<pc of write>/[<"virtual address" of write in ELF file, pre-relocation>] (<name of function containing store instruction>) lr=<value of lr> (<name of function containing this lr address>) [<first address written to by this instruction>-<last address written to by this instruction>] (<size of a single write by this instruction in bytes>) <number of times this instruction was "consecutively invoked" (with respect to other write instructions) that wrote to this range> -- <disassembled store instruction>  -- <source code of store instruction>
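
A line in this format can be split into fields with a regular expression. The sketch below is an assumption-heavy illustration, not code from the suite: it assumes addresses are printed in hexadecimal (with or without a 0x prefix), that the size and count are decimal, and that fields are separated by whitespace exactly as in the template above. The SAMPLE line is invented for the example and does not come from a real trace.

```python
import re

# One histogram line, following the template in this README.
# Assumption: addresses are hex, size and count are decimal.
LINE_RE = re.compile(
    r"pc=(?P<pc>\w+)/\[(?P<elf_pc>\w+)\]\s+\((?P<pc_func>[^)]*)\)\s+"
    r"lr=(?P<lr>\w+)\s+\((?P<lr_func>[^)]*)\)\s+"
    r"\[(?P<first>\w+)-(?P<last>\w+)\]\s+\((?P<size>\d+)\)\s+"
    r"(?P<count>\d+)\s+--\s+(?P<disasm>.*?)\s+--\s+(?P<source>.*)$"
)

HEX_FIELDS = ("pc", "elf_pc", "lr", "first", "last")

# Invented sample line (hypothetical values, for illustration only).
SAMPLE = (
    "pc=0x40200800/[0x40200800] (memset) lr=0x40200a10 (board_init_f) "
    "[0x80000000-0x800000ff] (4) 64 -- str r1, [r0], #4  -- *p++ = c;"
)


def parse_histogram_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    for key in HEX_FIELDS:
        record[key] = int(record[key], 16)  # accepts "0x..." and bare hex
    record["size"] = int(record["size"])
    record["count"] = int(record["count"])
    return record
```

If the real histogram files deviate from the template (for example, extra columns), the pattern would need adjusting; returning None for non-matching lines makes such deviations easy to spot.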


As you may have noticed, all results are kept in 2 types of pytables
databases (any human-readable results are extracted from these
databases).  To explore these databases, I recommend opening each
in an ipython shell and poking around.

There is the static analysis database (defined in staticanalysis.py) which contains:
a root group named "staticanalysis"
within "/staticanalysis", 7 separate tables
- "relocs" -- Manually generated relocation information (such as what ranges of addresses get relocated during stage and to where)
- "writes" -- Lists every memory store instruction statically found in the binary's ".text" section. Includes information such as pc, if it's thumb, what registers we must read in order to calculate the destination of the write, how many bytes it writes
- "smcs" -- Location of every "smc" instruction (needed for baremetal debugging so we don't try to step through ROM instructions)
- "srcs" -- Information on a given instruction such as file/line number, disassembly, address.
- "funcs" -- Information on a given function (name, start and end addr)
- "longwrites" -- Manually generated information on looped writes so that we can generate tracing results of this loop without breaking or stepping through every loop iteration
- "skips" -- Manually generated information on instruction ranges the baremetal debugger should "skip" (not step through)

We create a separate one of these databases per boot stage.

The second type of database is a trace database, generated from runtime tracing information. This database is defined in database.py.
It has 1 group, named after its stage. This group contains 2 tables:
- "writes": an entry for each store instruction that occurs at runtime. Includes information such as pc (both as seen in ELF and perhaps a different relocated address), destination of write, size of write, lr, cpsr
- "writerange": a summary of the "writes" table that consolidates information on sequential writes (the writehistogram is generated from this table)
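
As a starting point for poking around, here is a minimal sketch that lists the row count and column names of each table described above. It is not part of the suite: the helper names table_paths and summarize are hypothetical, the group and table names are taken from this README, and the PyTables calls assume PyTables 3.x (open_file, get_node). The tables module is imported lazily so that table_paths works even where PyTables is not installed.

```python
# Table names as listed in this README.
STATIC_TABLES = ("relocs", "writes", "smcs", "srcs", "funcs", "longwrites", "skips")
TRACE_TABLES = ("writes", "writerange")


def table_paths(kind, stage=None):
    """Return the node paths expected in a 'static' or 'trace' database."""
    if kind == "static":
        return ["/staticanalysis/" + name for name in STATIC_TABLES]
    if kind == "trace":
        if not stage:
            raise ValueError("a trace database needs its stage name")
        # The single group of a trace database is named after the stage.
        return ["/%s/%s" % (stage, name) for name in TRACE_TABLES]
    raise ValueError("kind must be 'static' or 'trace'")


def summarize(h5_path, kind, stage=None):
    """Map each expected table path to (number of rows, column names)."""
    import tables  # PyTables 3.x is assumed here

    summary = {}
    with tables.open_file(h5_path, mode="r") as h5:
        for node_path in table_paths(kind, stage):
            table = h5.get_node(node_path)
            summary[node_path] = (table.nrows, table.colnames)
    return summary
```

For example, summarize("<stage>-traces.h5", "trace", stage="<stage>") would report on the two trace tables, with the placeholders replaced by a real stage name from your configuration.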


[INSTALL]

I recommend setting up a separate python environment for this
toolsuite using a tool such as virtualenv/virtualenvwrapper.

You can install the tools via:
> python setup.py install

SOFTWARE REQUIREMENTS
- git
- capstone (next branch -- https://github.com/aquynh/capstone/tree/next, I tested this with this commit -- https://github.com/aquynh/capstone/tree/c5dce55db4907f8f3bd326fd96d0b6a613b18e66 )
- gdbm (GNU database library)
- libgit2
- gdb with python2 bindings (most newer versions of gdb support python, however some only have python3, not python2)
- Python 2 (I recommend version 2.7.11)
- The following python2 libraries:
  - doit (version 0.29.0)
  - PyYAML
  - pathlib
  - pygit2
  - ipython
  - intervaltree
  - tables
  - capstone
  - unicorn
  - numpy
  - sortedcontainers
  - pdfminer
  - munch
  - functools32
  - toml.py
  - (optional) ipython v5.7.0 (for embedded console support)
  - (optional, only if working with qemu watchpoint) qemu's tracetool.py -- https://raw.githubusercontent.com/qemu/qemu/86b5aacfb972ffe0fa5fac6028e9f0bc61050dda/scripts/tracetool.py , which needs to be placed somewhere in your python path



(may not be complete, but will list new ones as they are found)

This test suite is written for python2.7, so make sure that is the
version of python you are using (it will not work with python 3)

You will need sudo access on your machine if you are debugging a bootloader.


[Test target configuration]

Fiddle searches for and loads a configuration file at ~/.fiddle.cfg or
at the path given by the I_CONF environment variable.  Any settings
missing from this file are taken from configs/defaults.cfg.
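
The lookup can be pictured as follows. This is a minimal sketch under stated assumptions, not fiddle's actual code: it assumes that I_CONF takes precedence over ~/.fiddle.cfg when both exist, and that defaults are filled in per key within each section. The function names find_config and with_defaults are hypothetical, and the configuration is modeled as plain nested dictionaries so that no file format is implied.

```python
import os


def find_config(environ=None, home=None):
    """Pick the configuration file path.

    Assumption: I_CONF, when set, wins over ~/.fiddle.cfg.
    """
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    override = environ.get("I_CONF")
    if override:
        return override
    return os.path.join(home, ".fiddle.cfg")


def with_defaults(user, defaults):
    """Fill in settings missing from `user` with values from `defaults`.

    Both arguments map section names to dicts of settings (an assumed shape).
    """
    merged = {}
    for section in set(defaults) | set(user):
        values = dict(defaults.get(section, {}))
        values.update(user.get(section, {}))  # user settings override defaults
        merged[section] = values
    return merged
```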



Main/root: the root directory from which almost all other paths are
calculated (the exceptions are openocd_cfg paths, Software binary
paths, and Software root paths that have the 'path: full' option
set).

test_data_path: relative location from Main/root where test data
should be stored

test_suite_path: relative location from Main/root where the directory
containing this readme is stored (bootloader_test_suite)

cc: absolute location of CROSS_COMPILER path

HardwareClass/sdskeleton: path to an sdcard image that the test
suite can copy and then install new bootstage images onto


Other things you may want to set:

Software configuration:
For any given Software entry you may need to set the value of root.
The value of root should typically be the path to the root of that
software's source code, relative to Main/root.  However, if "path:
cc" is set, then the path is relative to Main.cc.  If "path: full"
is set, then the path is relative to your system's root directory.
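
The three cases can be summarized in a few lines. This is an illustrative sketch only: resolve_software_root is a hypothetical name, and the example paths in the usage note are invented. It simply joins the root value onto the base directory that the "path" option selects.

```python
import posixpath


def resolve_software_root(root, main_root, main_cc, path_mode=None):
    """Resolve a Software entry's root according to its 'path' option.

    path_mode is None (default), "cc", or "full", as described above.
    """
    if path_mode == "full":
        base = "/"  # relative to the system's root directory
    elif path_mode == "cc":
        base = main_cc  # relative to Main.cc
    else:
        base = main_root  # relative to Main/root
    return posixpath.normpath(posixpath.join(base, root))
```

For instance, with main_root="/home/u/work" and main_cc="/opt/cc", a root of "u-boot" resolves to "/home/u/work/u-boot" by default and a root of "bin" resolves to "/opt/cc/bin" when path_mode is "cc".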

Also you may end up needing to tweak the build settings for a given
piece of software. Build configuration can often be tweaked by setting
the software's build_prepare (if build is set to StandardBuilder).
However some pieces of software need test suite configuration
information to properly configure themselves.  In these cases the
build option is not set to StandardBuilder and there is a class of the
build option's name in config.py that calculates the build information
(example: the UBootBuilder class)

Main/bootloader: its value corresponds to the 'name' of a Bootloader
entry in the configuration.  It allows you to select between different
bootloader configurations.  A bootloader configuration selects which
configuration (defconfig) it should build against and which stages it
supports. Each item that a Bootloader entry lists under stages
corresponds to a Bootstages class with that name.



USAGE

The test suite uses pydoit to manage data collection requirements.
For example, the test suite needs to preprocess the bootloader binary
before it is able to execute and instrument the bootloader.
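
For readers unfamiliar with pydoit, the sketch below shows how such a dependency can be expressed in a dodo.py file. It is not the suite's actual task definition: the task names are hypothetical and the action strings are placeholders that reuse the main.py invocations described in this README. In pydoit, a function named task_<name> returns a dictionary, and "task_dep" lists tasks that must run first.

```python
def task_static_analysis():
    """Preprocess the bootloader binary (placeholder action)."""
    return {
        "actions": ["main.py -c"],  # placeholder command from this README
        "verbosity": 2,  # show the command's output while it runs
    }


def task_trace():
    """Execute and instrument the bootloader; requires preprocessing first."""
    return {
        "actions": ["main.py -r"],  # placeholder command from this README
        "task_dep": ["static_analysis"],  # run task_static_analysis beforehand
        "verbosity": 2,
    }
```

Saved as dodo.py, running "doit trace" would run static_analysis first and then trace, which is the ordering constraint described above.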


This preprocessing step creates the directories it needs for its test
results directory, builds u-boot, creates a new sd image, and runs all
static analysis.  If a test directory with the same bootloader/build
tools configuration exists, it will first delete that directory.

This testsuite will detect any changes in the u-boot source code or
gcc (and in the future any gcc plugins) and create a new
test_config_instance directory along with a new sdcard image and
static analysis database when it is invoked.


You must first invoke the test suite via: main.py -c
  - this will cause it to preprocess the bootloader using options specified in bootsuite.yaml

Then if you execute main.py -r
  - It will execute, instrument, and collect data from the binary

You can then postprocess collected data via
  - main.py -p consolidate_writes
  - main.py -p browse_db
  - main.py -p policy_check

Try to not invoke multiple instances of main.py at once as it may mess
up your test results.

If any builds fail you can either ask it to rebuild all software via:
main.py -b

Or you can ask it for the build commands it uses on a piece of software via
main.py -B <software_name>
(software_name as it is written in the software's configuration in bootsuite.yaml)


You can ask the test suite to print the commands it uses to execute and instrument the
hardware (or emulator) and to run the tracing software:
main.py -p

All tracing tools, with the exception of QEMU watchpoints, will
directly import their tracing data into the trace database and create the
human-readable writehistogram file.

PYTHON MODULES TO INSTALL
interval
yamlconf
pygit2
pyinter
ipython
sortedcontainers
numpy
numexpr
tables
doit