Unify FactGenerator/Datalog relation names
langston-barrett opened this issue · 2 comments
As of #40, the Fact Generator and Datalog code share a list of file/relation names. The Fact Generator refers to relations by `group::rel`, e.g., `variable::name`, whereas that corresponds to the Datalog relation `variable_name`. While there is a clear correspondence between `variable::name` and `variable_name`, the relationship doesn't hold in general, e.g. we also have `variable::id` corresponding to just `variable`. Thus, `predicates.inc` has lines like:
PREDICATE(global_var, unmangl_name, global_variable_has_unmangled_name)
where the first two entries describe the C++ (Fact Generator) name, and the third entry describes the filename/Datalog relation name. We should try to derive the latter from the former for the sake of consistency. This will involve changing a ton of Datalog code to use new relation names.
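As a rough illustration of what "deriving the latter from the former" could look like, here is a minimal sketch that builds both names from a two-argument X-macro by stringifying and concatenating the tokens. The `EXAMPLE_PREDICATES` list, the `AS_ENTRY` macro, and the `PredicateName` struct are hypothetical stand-ins for this example; they are not the actual contents or layout of `predicates.inc`.

```cpp
#include <string_view>

// Hypothetical stand-in for predicates.inc: each entry lists only the
// C++ (Fact Generator) group and relation, with no third argument.
#define EXAMPLE_PREDICATES(X) \
  X(variable, name)           \
  X(global_var, unmangl_name)

// Pairs the C++-side name with the Datalog relation name derived from it.
struct PredicateName {
  std::string_view cpp_name;
  std::string_view datalog_name;
};

// Adjacent string literals are concatenated by the compiler, so
// AS_ENTRY(variable, name) yields {"variable::name", "variable_name"}.
#define AS_ENTRY(group, rel) \
  PredicateName{#group "::" #rel, #group "_" #rel},

constexpr PredicateName kExamplePredicates[] = {
    EXAMPLE_PREDICATES(AS_ENTRY)};
```

Under this assumed scheme the Datalog name is always `group_rel`, so cases like `variable::id` mapping to just `variable` would need either renaming on the Datalog side or an explicit exception.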
There is some urgency to this task: It's important to do this early in the git history if we're to do it at all. The git history of this project is fairly empty at this point, but someday will encode important choices about how the analysis was constructed.
For the sake of posterity, here are the major areas where they disagree, and how I'm approaching them in #45:
- Abbreviations, e.g., `global_var` vs. `global_variable`: I'm picking something clear but brief and documenting it in the dev docs
- "infixes" such as `_has_` and `_in_`: These are mostly redundant, so I'll remove them
- `id`: The FactGenerator has e.g. `variable::id` which corresponds to just `variable` in the Datalog. Not yet sure what to do about this.
- Extra word separators: The FactGenerator uses `extract_element` to refer to `extractelement`, and similarly for other multi-word opcodes. Prefer the opcodes as they appear in LLVM, i.e., as one word.
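
To make the infix rule above concrete, here is a small sketch of a helper that collapses `_has_` and `_in_` to a single underscore in an existing Datalog relation name. The function names `replaceAll` and `dropInfixes` are made up for this example and are not part of the project; the other rules (abbreviations, `id`, opcode spelling) are not covered because they need a hand-maintained table or a decision that hasn't been made yet.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Replace each non-overlapping occurrence of `from` in `s` with `to`,
// scanning left to right.
std::string replaceAll(std::string s, std::string_view from,
                       std::string_view to) {
  if (from.empty()) return s;
  for (std::size_t pos = s.find(from); pos != std::string::npos;
       pos = s.find(from, pos + to.size())) {
    s.replace(pos, from.size(), to);
  }
  return s;
}

// Hypothetical helper: drop the mostly redundant "_has_" and "_in_"
// infixes from a relation name, leaving a single underscore in their place.
std::string dropInfixes(std::string name) {
  name = replaceAll(std::move(name), "_has_", "_");
  name = replaceAll(std::move(name), "_in_", "_");
  return name;
}
```

For example, `dropInfixes("global_variable_has_unmangled_name")` gives `global_variable_unmangled_name`. A purely textual rewrite like this would also touch any relation where `_in_` is meaningful, so the results would still need a manual review.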