Psyche-C is a compiler frontend for the C programming language, designed specifically for the implementation of static analysis tools. Its distinguishing features are:
- Clean separation between the syntactic and semantic compiler phases.
- Both algorithmic- and heuristic-based syntax disambiguation strategies.
- Independent of #include, with type inference for missing struct, union, enum, and typedef.
- API inspired by that of the Roslyn .NET compiler.
- A parser AST resembling that of LLVM's Clang frontend.
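To make the need for syntax disambiguation concrete, here is a small sketch of a well-known ambiguity in the C grammar: the token sequence `(T)(v)` is a cast when `T` names a type and a function call when `T` names a function or function pointer (the statement `x * y;` is ambiguous in the same way, between a pointer declaration and a multiplication). The function names below are hypothetical, and the example only illustrates the language-level problem; it does not show how Psyche-C resolves it internally.

```c
/* Helper used by the "call" reading below. */
static int twice(int n)
{
    return 2 * n;
}

/* Reading 1: T is a typedef name, so (T)(v) is a cast. */
int cast_reading(void)
{
    typedef int T;
    double v = 42.9;
    return (T)(v);      /* converts v to int, truncating toward zero */
}

/* Reading 2: T is a function pointer, so (T)(v) is a call. */
int call_reading(void)
{
    int (*T)(int) = twice;
    int v = 21;
    return (T)(v);      /* calls twice(21) */
}
```

A parser that sees `(T)(v)` without the declaration of `T` (for example, because the header that declares it is missing) cannot tell the two readings apart from syntax alone, which is why a frontend for incomplete code needs some disambiguation strategy.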
Applications:
- Enabling static analysis techniques that require fully typed programs to run on incomplete source code.
- Compiling partial code (e.g., a snippet retrieved from a bug tracker) for object-code inspection.
- Generating test-input data for a function in isolation (without its dependencies).
- Quick prototyping of an algorithm, without the need for explicit types.
NOTE: The master branch is going through a major overhaul; syntax analysis (parsing and AST construction) is expected to be functional already, though. The original version of Psyche-C is available in this branch.
While Psyche-C is primarily used as a library for the implementation of static analysis tools, it is still a compiler frontend, and may also be used as an ordinary C parser through the cnippet driver adaptor.
// node.c
void f()
{
    T v = 0;
    v->value = 42;
    v->next = v;
}
If you compile the above snippet with GCC or Clang, you'd see the diagnostic "declaration for T is not available". But with cnippet the compilation would succeed: a definition for T is inferred.
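As an illustration, the sketch below shows one definition of T that is consistent with how node.c uses it: T must be a pointer to a record that has an integer-compatible `value` field and a `next` field of type T. The struct tag `node` and the choice of `int` are assumptions made for this example; the definition that cnippet actually generates may use different names and types.

```c
/* node_completed.c: node.c plus a hypothetical definition of T. */

/* Assumed shape of the inferred type: v->value = 42 requires an
 * integer-compatible field, and v->next = v requires a field of type T. */
typedef struct node {       /* "node" is a made-up tag */
    int value;
    struct node* next;
} *T;

/* Unchanged from node.c. With T defined, this now compiles.
 * It is meant to be compiled, not called: v is a null pointer. */
void f()
{
    T v = 0;
    v->value = 42;
    v->next = v;
}
```

With the definition in place, an ordinary compiler accepts the file (for example, when compiling it to an object file), which is what makes partial code usable for downstream analysis or object-code inspection.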
- The Doxygen-generated API.
- HOW-TO blog posts:
- An online interface that offers a glimpse of Psyche-C's functionality.
- Articles:
Except for type inference, which is written in Haskell, Psyche-C is written in C++17; cnippet is written in Python 3.
To build:
cmake CMakeLists.txt && make -j 4
To run the tests:
./test-suite
Related publications:
- Type Inference for C: Applications to the Static Analysis of Incomplete Programs. ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 42, Issue 3, Article No. 15, Dec. 2020.
- Inference of static semantics for incomplete C programs. Proceedings of the ACM on Programming Languages, Volume 2, Issue POPL, Article No. 29, Jan. 2018.
- AnghaBench: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction. Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2021.
- Generation of in-bounds inputs for arrays in memory-unsafe languages. Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO), Feb. 2019, p. 136-148.
- Automatic annotation of tasks in structured code. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), Article No. 31, Nov. 2018.