microsoft/ifc-spec

Rethink source locations in the IFC to accommodate implementations which need richer source-level information

Opened this issue · 2 comments

It was expressed by @ChuanqiXu9 in the last module implementors meeting that both Clang and GCC have a richer representation of source locations than was initially required from MSVC (the early adopter of the IFC). One example cited was declarations:

    int i = 0;
//  ^        ^
//  |        | Decl end
//  | Decl begin

The other vendors keep a span of source locations for each declaration, representing the declaration node as written in source. In Clang this is represented as DeclStmt <line:2:5, col:14>, where the span starts at column 5 and ends at column 14.

In the IFC today, nearly every node carries only a single source location, and this representation appears to be insufficient for the needs of other compiler implementations. We should either rethink it or provide a richer source location structure that allows for ranges of locations.
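To make the idea of a range-based location concrete, here is a minimal sketch. Every name in it (SourcePosition, SourceRange, make_range, contains, point_range) is hypothetical and not taken from the IFC specification, and it assumes an inclusive end position and 1-based line and column numbers. It is only meant to illustrate the shape of the data, not to propose an actual encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only; none of these names
// come from the IFC specification.
struct SourcePosition {
    std::uint32_t line;    // 1-based line number (assumption)
    std::uint32_t column;  // 1-based column number (assumption)
};

struct SourceRange {
    SourcePosition begin;  // first position of the node as written
    SourcePosition end;    // last position of the node (assumed inclusive)
};

// True when `a` comes before `b` in the file, or is the same position.
constexpr bool precedes_or_equals(SourcePosition a, SourcePosition b) {
    return a.line < b.line || (a.line == b.line && a.column <= b.column);
}

// True when `p` lies within `r`, both endpoints included.
constexpr bool contains(SourceRange r, SourcePosition p) {
    return precedes_or_equals(r.begin, p) && precedes_or_equals(p, r.end);
}

// A producer that only tracks one location per node can still emit
// a degenerate range whose begin and end coincide.
constexpr SourceRange point_range(SourcePosition p) {
    return SourceRange{p, p};
}

// Builds a range, checking that the endpoints are in file order.
inline SourceRange make_range(SourcePosition begin, SourcePosition end) {
    assert(precedes_or_equals(begin, end));
    return SourceRange{begin, end};
}
```

With this sketch, the declaration from the example above would be make_range({2, 5}, {2, 14}), while a producer that records a single location per node could fall back to point_range.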

EDG also needs the total line count of a given file to avoid traversing all source location information when an IFC is first loaded.
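As a rough illustration of the line-count request, the sketch below shows a per-file record that stores the total line count once, at write time, so that a consumer can read it directly. FileRecord, count_lines, and is_valid_line are hypothetical names, not part of the IFC specification, and the convention that a final line without a trailing newline still counts as a line is an assumption.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical per-file record; not part of the IFC specification.
struct FileRecord {
    std::uint32_t line_count;  // total number of lines in the source file
};

// Counts lines in a source buffer. A trailing line without a final
// newline is still counted as a line (assumption).
constexpr std::uint32_t count_lines(std::string_view text) {
    std::uint32_t lines = 0;
    for (char c : text) {
        if (c == '\n') {
            ++lines;
        }
    }
    if (!text.empty() && text.back() != '\n') {
        ++lines;
    }
    return lines;
}

// The producer computes the count once when writing the file record.
constexpr FileRecord make_file_record(std::string_view text) {
    return FileRecord{count_lines(text)};
}

// A consumer can check a 1-based line number against the stored count
// without walking any per-node source locations.
constexpr bool is_valid_line(const FileRecord& file, std::uint32_t line) {
    return line >= 1 && line <= file.line_count;
}
```

The point of the design is that the cost of counting is paid by the producer, which already has the source buffer in hand, rather than by every consumer at load time.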

Thanks. For reference, Clang records additional source locations for different decls and stmts. For example, for an if-stmt: https://github.com/llvm/llvm-project/blob/0e1a52f556a90cc7b7ce7666fc476c99cf7bfb02/clang/lib/Serialization/ASTWriterStmt.cpp#L159-L161.

Although this is not formally documented, the recorded source locations for decls and stmts can be found in https://github.com/llvm/llvm-project/blob/main/clang/lib/Serialization/ASTWriterDecl.cpp and https://github.com/llvm/llvm-project/blob/main/clang/lib/Serialization/ASTWriterStmt.cpp by searching for AddSourceLocation in these two files.