This is (will be) a tool to build a call graph from a number of source files.
It relies heavily on other Python libraries (pycparser, ast) to do the heavy lifting related to parsing the code.

We will start by targeting C (using pycparser to get the AST), while keeping in mind that we would also like to deal with Python.

These are the steps.
- Get the AST of every compilation unit.

  For C:
  - Preprocess the compilation unit. This requires setting up all the necessary -D definitions and include paths, and also stripping the non-standard constructs (e.g. __attribute__ for GCC). For the include paths, remember to use the mock of libc that comes included in the pycparser repo. NOTE: preprocessing the files so that pycparser does not choke on them is usually the most tedious part.
  - Run pycparser and get the AST.

  For Python:
  - Read the content of the source file into memory. Then parse it using the facilities contained in the ast module of the standard library.
- Extract the function definition/function call information of every module/compilation unit. Note that the name of a function is not enough to identify it uniquely; the name of the compilation unit is also needed (e.g., for the main() function).
- Simulate the linking phase. For C, one needs to specify which files are linked together. For Python: TODO
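
For the C preprocessing step in the list above, the following is a minimal sketch of how the preprocessor arguments could be assembled and handed to pycparser. The helper names (build_cpp_args, parse_c_unit) and their parameters are hypothetical and not part of this tool; the sketch also assumes that gcc is available as the preprocessor and that the fake libc headers are taken from utils/fake_libc_include in a checkout of the pycparser repo.

```python
def build_cpp_args(defines, include_dirs, fake_libc_include):
    """Assemble the preprocessor arguments for one compilation unit.

    All parameters are hypothetical inputs of this sketch:
      defines           -- dict of macro name -> value (None for a bare -DNAME)
      include_dirs      -- list of project include directories
      fake_libc_include -- path to pycparser's utils/fake_libc_include
    """
    # -E tells gcc to stop after preprocessing.
    args = ["-E"]
    # Define __attribute__(x) as empty so the GCC extension disappears
    # before pycparser sees the code.
    args.append("-D__attribute__(x)=")
    for name, value in defines.items():
        args.append(f"-D{name}" if value is None else f"-D{name}={value}")
    # The fake libc headers come first so they are found before the
    # system headers, which pycparser usually cannot parse.
    args.append(f"-I{fake_libc_include}")
    args.extend(f"-I{d}" for d in include_dirs)
    return args


def parse_c_unit(path, cpp_args):
    """Preprocess and parse one C file, returning the pycparser AST."""
    # Imported here so that build_cpp_args works without pycparser installed.
    from pycparser import parse_file

    return parse_file(path, use_cpp=True, cpp_path="gcc", cpp_args=cpp_args)
```

In practice the list of -D definitions tends to grow as more GCC-specific keywords show up in the preprocessed output, so keeping it in one place, as in this sketch, may make that iteration less painful.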
It is not yet clear exactly how to do this for Python.
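
As one possible starting point, and only as a sketch, the extraction and linking steps could look like the following for Python sources. The function names (extract_unit_info, link) are hypothetical. The resolution rule is an assumption made for illustration: a call is resolved by bare name, preferring a definition in the caller's own unit, and it is dropped when it is ambiguous or unknown. Imports, methods, and attribute calls such as obj.f() are ignored entirely, and calls inside a nested function are also attributed to the enclosing function.

```python
import ast


def extract_unit_info(unit_name, source):
    """Return (definitions, calls) for one module.

    definitions -- set of (unit_name, function_name) pairs
    calls       -- list of (caller, callee_name), where caller is a
                   (unit_name, function_name) pair
    """
    tree = ast.parse(source, filename=unit_name)
    definitions, calls = set(), []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            caller = (unit_name, node.name)
            definitions.add(caller)
            for inner in ast.walk(node):
                # Only plain-name calls such as foo() are recorded.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    calls.append((caller, inner.func.id))
    return definitions, calls


def link(units):
    """Resolve calls across the units that are "linked" together.

    units -- dict of unit_name -> source text
    Returns a set of edges (caller, callee), both (unit_name, function_name).
    """
    all_defs, all_calls = set(), []
    for unit_name, source in units.items():
        defs, calls = extract_unit_info(unit_name, source)
        all_defs |= defs
        all_calls.extend(calls)

    # Map each bare function name to the units that define it.
    units_by_name = {}
    for unit_name, func_name in all_defs:
        units_by_name.setdefault(func_name, set()).add(unit_name)

    edges = set()
    for caller, callee_name in all_calls:
        candidates = units_by_name.get(callee_name, set())
        if caller[0] in candidates:
            # Assumed rule: a definition in the same unit wins.
            edges.add((caller, (caller[0], callee_name)))
        elif len(candidates) == 1:
            (unit_name,) = candidates
            edges.add((caller, (unit_name, callee_name)))
        # Ambiguous or undefined names (e.g. builtins) are left unresolved.
    return edges
```

Because every node is a (unit, name) pair, two units that each define main() produce two distinct nodes in the graph, which is the situation the extraction step warns about.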