joern is a tool for robust analysis of C/C++ code. It generates abstract syntax trees, control flow graphs and searchable indexes of code constructs, even for code that does not compile due to missing headers. As such, it has been specifically designed to meet the needs of code auditors, who often find themselves in a situation where constructing a working build environment is not a feasible option or is simply impossible due to missing code.
joern enables you as a code auditor to write quick-and-dirty but language-aware static analysis tools. To achieve this, it writes all acquired information to disk as text files or serialized Python objects, thus providing simple and direct access to the data.
joern is written in Python 2. To install it, execute the following:
$ sudo python2 setup.py install
You will also require a Java Virtual Machine to run CodeSensor.
- Parsing
To parse a codebase, execute the following:
$ joern_parser $path_to_codebase
This will create a directory named '.$codebase' containing the results generated by the parser, where $codebase is the name of the directory containing the codebase.
For each filename, the generated directory .$codebase contains the following entries:
filename/source:
The original source file.
filename/ast.csv:
The source file's abstract syntax tree in a grep'able version.
filename/ast.pickl:
The source file's abstract syntax tree saved as a pickled Python object.
filename/funcname/cfg.pickl:
The function's control flow graph saved as a pickled Python object.
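Because ast.csv is plain text, it can be searched without unpickling anything. The following is a minimal Python sketch that walks a result directory and reports every ast.csv line containing a given string. It matches lines as raw text and assumes nothing about the column layout of ast.csv. The function name grep_asts is hypothetical and not part of joern.

```python
import io
import os


def grep_asts(result_dir, needle):
    """Yield (path, line_number, line) for each ast.csv line containing needle.

    result_dir is the '.$codebase' directory created by joern_parser.
    Lines are matched as plain text; no CSV column layout is assumed.
    """
    for dirpath, dirnames, filenames in os.walk(result_dir):
        dirnames.sort()  # visit sub-directories in a stable order
        if "ast.csv" not in filenames:
            continue
        path = os.path.join(dirpath, "ast.csv")
        # errors="replace" keeps the scan going on oddly encoded sources
        with io.open(path, "r", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path, number, line.rstrip("\n")
```

For example, list(grep_asts(".mycode", "memcpy")) would collect all matching lines below a result directory named .mycode (an example name).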
- Filtering
The saved ASTs and CFGs contain all information generated by the parser. To concentrate your analysis only on certain types of nodes, you can use joern_filter_asts and joern_filter_cfgs respectively.
First, run the following:
$ joern_filter_asts .$codebase
$ joern_filter_cfgs .$codebase
This will filter ASTs and CFGs using a default filter and create the following files:
filename/funcname/prunedCfg.pickl
filename/funcname/prunedAst.pickl
You can design your own filter by specifying nodes of interest as command line parameters to joern_filter_asts and joern_filter_cfgs. Run joern_filter_asts --help and joern_filter_cfgs --help for more information. Alternatively, you can design your own filters and row2string converters and place them in sourceutils/pythonASTFilter/pruning and sourceutils/pythonCFGFilter/pruning respectively. Take a look at the existing scripts in these directories for more information.
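To check which functions a filter run actually produced output for, the pruned files can be collected per function. The sketch below assumes the layout given above (filename/funcname/prunedCfg.pickl below the result directory, where filename may itself contain sub-directories). The helper name find_pruned is hypothetical and not part of joern.

```python
import os


def find_pruned(result_dir, name="prunedCfg.pickl"):
    """Return {(filename, funcname): path} for every pruned file found.

    Assumed layout: result_dir/filename/funcname/<name>, where the last
    path component is the function name and everything before it is the
    source file's path relative to result_dir.
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(result_dir):
        if name not in filenames:
            continue
        relative = os.path.relpath(dirpath, result_dir)
        filename, funcname = os.path.split(relative)
        found[(filename, funcname)] = os.path.join(dirpath, name)
    return found
```

Calling find_pruned(".mycode", "prunedAst.pickl") does the same for pruned ASTs; .mycode is an example directory name.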
- Indexing
Run the following to create index files:
$ joern_index .$codebase
This creates the following index files:
callIndex.pickl:
Python dictionary mapping the names of functions to the list of locations where they are called.
conditionIndex.pickl:
Python dictionary mapping conditions to the locations where they are imposed.
declarationIndex.pickl:
Python dictionary mapping the names of types to the locations where they are used to declare a variable.
functionIndex.pickl:
Python dictionary mapping the names of functions to function definitions with that name.
Take a look at sourceutils/codeIndex/CodeIndexCreator.py to see how simple it is to create these indexes based on the data in .$codebase. You can add any further index you need to this file.
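Since each index is a pickled Python dictionary, querying it takes only a few lines. The sketch below rests on several assumptions: the index files are assumed to sit directly inside the result directory, the representation of a "location" is not specified here and is simply returned as stored, and the helper names load_index and call_sites are hypothetical. Unpickling should happen in an environment compatible with the one that wrote the files (for instance, with joern's modules importable if the stored values are joern objects), and only on files you trust.

```python
import os
import pickle


def load_index(result_dir, index_name):
    """Load one index file (e.g. 'callIndex.pickl') and return its dictionary.

    Assumption: the index file is stored directly in result_dir.
    """
    path = os.path.join(result_dir, index_name)
    with open(path, "rb") as handle:
        return pickle.load(handle)


def call_sites(result_dir, function_name):
    """Return the recorded call locations of function_name (empty list if none)."""
    index = load_index(result_dir, "callIndex.pickl")
    return index.get(function_name, [])
```

For instance, call_sites(".mycode", "memcpy") would return every recorded call to memcpy in a result directory named .mycode (an example name).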
- Visualization
joern provides some very basic functionality to visualize abstract syntax trees and control flow graphs. This is mainly intended for debugging, i.e. to make sure that the filters you define generate the expected output.
For example, to plot a filtered CFG, run the following:
$ ./joern_plot filename/funcname/prunedCfg.pickl
Developed by: Fabian 'fabs' Yamaguchi (University of Goettingen)
Greetings: @trapflag, @nion, @mlsec, @teh_gerg, @sergeybratus, @joernchen