Simple type checking
We want to investigate a simple case of variability-aware type checking.
Here is a paper which outlines the typechef type system:
http://www.cs.cmu.edu/~ckaestne/pdf/fse13.pdf
Here is a simple example which I think would be useful to target:
#ifdef A
int i;
#else
int *i;
#endif
void foo() {
i = 1;
}
I consider this example simple since it only uses fundamental types (and a pointer to a fundamental type). It's actually fairly hard to create a type error in C++ using only fundamental types, since so many implicit conversions can happen.
Note: assigning an int to an enum variable also causes an error.
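To make the two claims above concrete, here is a small sketch that uses standard type traits to ask the compiler which assignments are well-formed. The enum name Color is made up for illustration; everything else is standard C++11.

```cpp
#include <type_traits>

// Hypothetical enum, only here to illustrate the note about enums.
enum Color { Red, Green };

// Configuration A: `int i;` accepts `i = 1`.
static_assert(std::is_assignable<int &, int>::value,
              "an int can be assigned to an int variable");

// Configuration !A: `int *i;` rejects `i = 1`, since an int value
// does not implicitly convert to a pointer.
static_assert(!std::is_assignable<int *&, int>::value,
              "an int cannot be assigned to an int* variable");

// Assigning an int to an enum variable is also ill-formed in C++.
static_assert(!std::is_assignable<Color &, int>::value,
              "an int cannot be assigned to an enum variable");

// Most other fundamental-type pairs convert implicitly, which is why
// type errors are hard to provoke with fundamental types alone.
static_assert(std::is_assignable<double &, int>::value, "int -> double");
static_assert(std::is_assignable<char &, long>::value, "long -> char");
```

If the file compiles, every claim in the static_asserts holds; the pointer case is the one the example in this issue relies on.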
Possible Solution
Here I'll outline a possible way to approach the above example. I will focus on what should happen when the statement i = 1; is parsed and type checked.
The normal version of clang produces an AST which looks something like this:
|-VarDecl 0x5572b9b6c848 <simple_type_checking.cpp:6:1, col:5> col:5 used i 'int'
`-FunctionDecl 0x5572b9b6c948 <line:9:1, line:11:1> line:9:6 foo 'void ()'
  `-CompoundStmt 0x5572b9b6ca48 <col:12, line:11:1>
    `-BinaryOperator 0x5572b9b6ca28 <line:10:4, col:8> 'int' lvalue '='
      |-DeclRefExpr 0x5572b9b6c9e8 <col:4> 'int' lvalue Var 0x5572b9b6c848 'i' 'int'
      `-IntegerLiteral 0x5572b9b6ca08 <col:8> 'int' 1
Note that the statement i = 1; is represented by a BinaryOperator, with the left-hand side represented by a DeclRefExpr and the right-hand side represented by an IntegerLiteral.
The way that clang does its type checking is roughly as follows:
- The Parser class parses an expression and calls a Sema method usually named ActOnX, where X is the type of expression that will eventually be built.
- The ActOnX method may convert something from the Parser's representation to the Sema class's representation of the same thing. Then this method will call another Sema method named BuildX.
- A method of the Sema class called BuildX will do most of the work of type checking the subexpressions and (if type checks pass) eventually construct an AST node of type X.
See here for more information on type checking the clang AST: https://clang.llvm.org/docs/InternalsManual.html#semantic-handling
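The Parser -> ActOnX -> BuildX hand-off described above can be sketched with a toy model. None of the names below (ToyParser, ToySema, ActOnAssign, BuildAssign, Ty, Expr) are Clang's real classes or signatures; they are stand-ins that only mirror the division of labour, and the type rule is deliberately simplified to "both sides must have the same type".

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy stand-ins for illustration only; none of this is Clang's real API.
enum class Ty { Int, IntPtr };

// Sema-level node: an expression that already carries a checked type.
struct Expr {
  Ty type;
};

class ToySema {
public:
  int diagnostics = 0;  // number of type errors reported so far

  // "BuildX": type check the operands and build the node only on success.
  std::unique_ptr<Expr> BuildAssign(const Expr &lhs, const Expr &rhs) {
    if (lhs.type != rhs.type) {
      ++diagnostics;
      return nullptr;
    }
    return std::make_unique<Expr>(Expr{lhs.type});
  }

  // "ActOnX": convert the parser's representation (type spellings here)
  // into Sema's representation, then delegate to BuildX.
  std::unique_ptr<Expr> ActOnAssign(const std::string &lhsSpelling,
                                    const std::string &rhsSpelling) {
    return BuildAssign(Expr{toTy(lhsSpelling)}, Expr{toTy(rhsSpelling)});
  }

private:
  static Ty toTy(const std::string &spelling) {
    assert((spelling == "int" || spelling == "int*") && "unknown toy type");
    return spelling == "int" ? Ty::Int : Ty::IntPtr;
  }
};

// The parser only recognizes syntax and hands the pieces to Sema.
class ToyParser {
public:
  explicit ToyParser(ToySema &sema) : sema_(sema) {}

  std::unique_ptr<Expr> ParseAssign(const std::string &lhs,
                                    const std::string &rhs) {
    return sema_.ActOnAssign(lhs, rhs);
  }

private:
  ToySema &sema_;
};
```

Calling ParseAssign("int", "int") on a ToyParser yields a node, while ParseAssign("int*", "int") yields a null pointer and bumps the diagnostics counter, which is the same "no AST node if the type check fails" behaviour the list describes.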
In the case of an expression like i = 1, I believe the relevant Sema methods are BuildDeclarationNameExpr and ActOnBinOp.
The BuildDeclarationNameExpr method checks to see if a name like i refers to a valid declaration. This method needs to be modified to accept a VariantDecl, since the lookup of i returns a VariantDecl. It should iterate through every individual Decl contained in the VariantDecl and ensure that it is a valid target for assignment.
Then the method ActOnBinOp should be modified to make sure that every possible lhs is compatible with the type of the expression on the rhs (in this case an integer).
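As a rough sketch of that per-variant check, the toy model below walks every declaration wrapped by a variant and records the configurations in which assigning an int would be ill-typed. ToyDecl, ToyVariantDecl, acceptsInt, and checkAssignInt are hypothetical names, and storing the #ifdef condition as a string is an assumption made for illustration; the real VariantDecl in the modified Clang may look quite different.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model; not the VariantDecl from the modified Clang.
enum class Ty { Int, IntPtr };

struct ToyDecl {
  std::string condition;  // configuration in which this declaration exists
  Ty type;
};

struct ToyVariantDecl {
  std::string name;
  std::vector<ToyDecl> variants;
};

// In this toy model only an int variable accepts an int value.
bool acceptsInt(Ty lhs) { return lhs == Ty::Int; }

// Returns the conditions under which `name = <int expr>` is ill-typed.
std::vector<std::string> checkAssignInt(const ToyVariantDecl &decl) {
  assert(!decl.variants.empty() && "a variant wraps at least one declaration");
  std::vector<std::string> failing;
  for (const ToyDecl &d : decl.variants)
    if (!acceptsInt(d.type))
      failing.push_back(d.condition);
  return failing;
}
```

For the example in this issue, a variant i with the declarations {A: int} and {!A: int*} produces a single failing condition, !A, so the error can be reported for exactly the configuration where it occurs.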
Generalizing for any binary operator
To extend this approach to other types of expressions, we will likely have to modify many ActOnX or BuildX methods to type check every variant. This will ensure that the operands of a binary operator are type checked properly.
Then in the ActOnBinOp method, we will have to check every possible combination of lhs and rhs for compatibility. This will ensure the entire binary operation is type checked in every configuration. Note that in the above example only the lhs had variability, but in the general case the rhs could have variability too.
Note that this will result in checking the Cartesian product of the lhs and rhs types. As mentioned in the TypeChef paper (http://www.cs.cmu.edu/~ckaestne/pdf/fse13.pdf), this is unavoidable.
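A minimal sketch of the Cartesian-product check, again as a toy model rather than Clang code: Variant, assignable, and checkAllCombinations are hypothetical names, and the assignability rule is a simplification (arithmetic types convert to each other, pointers only match themselves). The sketch does not prune combinations whose conditions contradict each other, which a real implementation would probably want to do.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of operands that vary by configuration.
enum class Ty { Int, Double, IntPtr };

struct Variant {
  std::string condition;  // configuration in which this type applies
  Ty type;
};

bool isArithmetic(Ty t) { return t == Ty::Int || t == Ty::Double; }

// Simplified rule: arithmetic types convert to each other,
// pointers are only assignable from the same pointer type.
bool assignable(Ty lhs, Ty rhs) {
  return (isArithmetic(lhs) && isArithmetic(rhs)) || lhs == rhs;
}

using ConditionPair = std::pair<std::string, std::string>;

// Checks every (lhs, rhs) combination and returns the condition pairs
// under which the assignment is ill-typed.
std::vector<ConditionPair> checkAllCombinations(
    const std::vector<Variant> &lhs, const std::vector<Variant> &rhs) {
  std::vector<ConditionPair> failing;
  for (const Variant &l : lhs)
    for (const Variant &r : rhs)
      if (!assignable(l.type, r.type))
        failing.emplace_back(l.condition, r.condition);
  // At most |lhs| * |rhs| combinations exist, so at most that many can fail.
  assert(failing.size() <= lhs.size() * rhs.size());
  return failing;
}
```

With lhs = {A: int, !A: int*} and rhs = {B: double, !B: int*}, four combinations are checked and two fail: (A, !B) and (!A, B).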
In commit db79d8c, I modified Sema::TryAndResolveContextualAmbiguity so that it always returns a VariantDecl when more than one declaration is found during lookup.
This is causing Sema::BuildDeclarationNameExpr to report an error, since a VariantDecl does not dyn_cast to a ValueDecl. This will need to be handled by iterating over each Decl within the VariantDecl and performing all the checks on these individual Decls instead of on the whole VariantDecl.
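The failing cast and the proposed fix can be illustrated with a toy class hierarchy. This sketch uses standard dynamic_cast in place of LLVM's dyn_cast, and the classes below only borrow Clang's names; in particular, the decls member of VariantDecl is an assumption about how the wrapped declarations might be stored.

```cpp
#include <cassert>
#include <vector>

// Toy hierarchy; the names mirror Clang's but these are not Clang's classes.
struct Decl {
  virtual ~Decl() = default;
};
struct ValueDecl : Decl {};
struct TypedefDecl : Decl {};  // a declaration that is not a value

// Hypothetical layout: the variant simply holds the per-configuration decls.
struct VariantDecl : Decl {
  std::vector<const Decl *> decls;
};

// Returns true if d is a value declaration, or a variant whose wrapped
// declarations are all value declarations.
bool allReferToValues(const Decl *d) {
  assert(d && "lookup result must not be null");
  if (const auto *variant = dynamic_cast<const VariantDecl *>(d)) {
    if (variant->decls.empty())
      return false;
    for (const Decl *inner : variant->decls)
      if (!allReferToValues(inner))
        return false;
    return true;
  }
  // Non-variant case: the original single-declaration check.
  return dynamic_cast<const ValueDecl *>(d) != nullptr;
}
```

Casting a VariantDecl directly to ValueDecl yields a null pointer, which corresponds to the error described above, while allReferToValues accepts a variant of two ValueDecls and rejects one that also wraps a TypedefDecl.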