/joern-old

Old version of joern, as used for the ACSAC '12 paper. Kept only for archival purposes.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

joern

joern is a tool for robust analysis of C/C++ code. It generates abstract syntax trees, control flow graphs and searchable indexes of code constructs, even for code that does not compile due to missing headers. As such, it has been specifically designed to meet the needs of code auditors, who often find themselves in a situation where constructing a working build environment is not a feasible option or is simply impossible due to missing code.

joern enables you as a code auditor to write quick-and-dirty but language-aware static analysis tools. To achieve this, it writes all acquired information to disk as text files or serialized Python objects, thus providing simple and direct access to the data.

Installation:

joern is written in Python 2. To install it, execute the following:

$ sudo python2 setup.py install

You will also require a Java Virtual Machine to run CodeSensor.

Usage:

  1. Parsing

To parse a codebase, execute the following:

$ joern_parser $path_to_codebase

This will create a directory named '.$codebase' containing the results generated by the parser, where $codebase is the name of the directory containing the codebase.

For each filename, the generated directory .$codebase contains the following entries:

filename/source: The original source file.

filename/ast.csv: The source file's abstract syntax tree in a grep'able version.

filename/ast.pickl: The source file's abstract syntax tree saved as a pickle'd Python object.

filename/funcname/cfg.pickl: The function's control flow graph saved as a pickle'd Python object.
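Since ast.csv is described as grep'able, the same kind of search can be sketched in Python. The following is a minimal sketch, not part of joern: the function name grep_ast_csv is hypothetical, and it deliberately makes no assumption about the column layout of ast.csv, matching plain substrings only.

```python
import io
import os


def grep_ast_csv(results_dir, needle):
    """Yield (path, line_number, line) for every ast.csv line containing needle.

    results_dir is the '.$codebase' directory written by joern_parser.
    No assumption is made about the CSV columns; this is a substring match.
    """
    for dirpath, _dirnames, filenames in os.walk(results_dir):
        if "ast.csv" not in filenames:
            continue
        path = os.path.join(dirpath, "ast.csv")
        # errors="replace" keeps the scan going if a file has unexpected bytes.
        with io.open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path, number, line.rstrip("\n")
```

For a results directory named '.mycode' (a made-up name), list(grep_ast_csv('.mycode', 'memcpy')) would collect every AST line that mentions memcpy, together with the file it came from.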

  2. Filtering

The saved ASTs and CFGs contain all information generated by the parser. To concentrate your analysis only on certain types of nodes, you can use joern_filter_asts and joern_filter_cfgs respectively.

First, run the following:

$ joern_filter_asts .$codebase
$ joern_filter_cfgs .$codebase

This will filter ASTs and CFGs using a default filter and create the following files:

filename/funcname/prunedCfg.pickl

filename/funcname/prunedAst.pickl
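To check that the filter step ran for every function, the directory layout described above can be walked directly. This is a minimal sketch, not part of joern: the helper name is hypothetical, and it assumes only the file names given in this README (cfg.pickl and prunedCfg.pickl in the same function directory).

```python
import os


def functions_missing_pruned_cfg(results_dir):
    """Return function directories that hold cfg.pickl but no prunedCfg.pickl.

    results_dir is the '.$codebase' directory written by joern_parser.
    """
    missing = []
    for dirpath, _dirnames, filenames in os.walk(results_dir):
        if "cfg.pickl" in filenames and "prunedCfg.pickl" not in filenames:
            missing.append(dirpath)
    return sorted(missing)
```

An empty result suggests that joern_filter_cfgs produced output for every function the parser found.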

You can design your own filter by specifying nodes of interest as command line parameters to joern_filter_asts and joern_filter_cfgs. Run joern_filter_asts --help and joern_filter_cfgs --help for more information. Alternatively, you can design your own filters and row2string converters and place them in sourceutils/pythonASTFilter/pruning and sourceutils/pythonCFGFilter/pruning respectively. Take a look at the existing scripts in these directories for more information.

  3. Indexing

Run the following to create index files:

$ joern_index .$codebase

callIndex.pickl: Python dictionary mapping the names of functions to the list of locations where they are called.

conditionIndex.pickl: Python dictionary mapping conditions to the locations where they are imposed.

declarationIndex.pickl: Python dictionary mapping the names of types to the locations where they are used to declare a variable.

functionIndex.pickl: Python dictionary mapping the names of functions to function definitions with that name.

Take a look at sourceutils/codeIndex/CodeIndexCreator.py to see how simple it is to create these indexes based on the data in .$codebase. You can add any further index you need to this file.
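Because the indexes are pickled Python dictionaries, querying one takes only a few lines. The following is a minimal sketch under stated assumptions: load_index and call_sites are hypothetical helper names, the caller supplies the path to the index file (this README does not say where joern_index writes it), and the index is assumed to behave like a plain dict. If the stored locations are instances of joern's own classes, the pickle will likely only load in a Python 2 environment where joern is installed.

```python
import pickle


def load_index(path):
    """Load one of the index files, e.g. callIndex.pickl, as a Python object."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def call_sites(call_index, function_name):
    """Return the locations where function_name is called, or an empty list.

    Assumes call_index maps function names to lists of locations, as
    described for callIndex.pickl above.
    """
    return call_index.get(function_name, [])
```

The same load_index helper should work for conditionIndex.pickl, declarationIndex.pickl and functionIndex.pickl, since all four are described as dictionaries. Only unpickle files you generated yourself, as pickle can execute arbitrary code on load.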

  4. Visualization

joern provides some very basic functionality to visualize abstract syntax trees and control flow graphs. This is mainly intended for debugging, i.e. to make sure that the filters you define generate the expected output.

For example, to plot a filtered CFG, run the following:

$ ./joern_plot filename/funcname/prunedCfg.pickl

Credits:

Developed by: Fabian 'fabs' Yamaguchi (University of Goettingen)

Greetings: @trapflag, @nion, @mlsec, @teh_gerg, @sergeybratus, @joernchen