I have included my notes under note folder. My interpretation might be wrong so isseus and PR are welcome.
Here are some frameworks for program analysis. I strongly recommend you to implement algorithms on them.
LLVM
: theClang Static Analyzer
is a awesome tool to analyze C/C++ from AST level. AndLLVM Pass
provides a more low-level IR to analyze.Soot
: tools for Java program anlysis and optimizationBAP
: Binary Analysis Platform. Written in OCaml. We can inspect customize IRBIL
to analyze different binary.Angr
: Binary Analysis and Symbolic Exectuion
- Infer: Source code static analysis based on OCaml
- BAP based
- CWE-Checker: Binary analysis based on BAP
- CBAT TOOLS: This project implements VSA and WP on BAP
- Several papers related to BAP implementation: http://okmij.org/ftp/tagless-final/JFP.pdf http://www.cs.utexas.edu/~wcook/Drafts/2012/ecoop2012.pdf http://okmij.org/ftp/tagless-final/course/optimizations.html
- LLVM based
- Soot based
- TODO
- pfff: A tool set for source code analysis
- BinCat
- Angr: Binary Symbolic Executor
Normally, the basic parts include dataflow analysis frameword (reaching definition, interval analysis, …), pointer analysis (andreson and steensgaard), and abstract interpretation (sign analysis). You are also encouraged to learn discrete math to understand the notations in text books.
- Static Analysis(Abstract Interpretation Based):
- UW CSE501 Personally recommend, the contents are more compacted.
- CMU CS17-355: Some slides are missing. And the contents focus on security stuff more. Recommend to use notes here and slides from UW
- https://homepages.dcc.ufmg.br/~fernando/classes/dcc888/ementa/
- IOWA CS513X: The topic is about staitc analysis but slighlty more depth.
- MIT 16.399: Abstract Interpretation, in a more math way.
- Program Synthesis and Model Checking
- UW CSE599: Advanced Computer-Aided Reasoning for Software
- CMU CS15-414: Model Checking
- MIT 6.820: Focused on abstract interpreter and model checking.
- SMT:
- UW CSE 507: SMT solver and applications
- CMU 15816: Automated Reasoning and Satisfiability
- Books:
- SPA Book: Personally recommend. This book is static analysis oriented. It also provied a toy language analyszer. The psedu-code and syntax are better the PPA.
- Principle of Program Analysis: Old school book. The syntax is abstract. Might be too hard to understand.
- Abstract Interpretation: Past, Present and Future: Check out the references, surprise!
- Abstract Interpretation Course: 16.399 by MIT
- ABSTRACT INTERPRETATION : ‘A UNIFIED LATTICE MODEL FOR STATIC ANALYSIS OF PROGRAMS BY CONSTRUCTION OR APPROXIMATION OF FIXPOINTS: The original paper.
- Wiki about AI
- Systematic Design of Program Analysis Frameworks
- Meta Abstract Interpretation: Use abstract interpretation to analyze static analyzer (I guess somehow simillar to AAM)
- Constructive Galois Connections
- Abstracting Abstract Machine Series:
- Systematic abstraction of abstract machines: How to write an AAM in Haskell.
- Abstracting Abstract Machines: Using CFA to abstract the behavior of an abstract machine for static analysis.
- Context-, Flow-, and Field-Sensitive Data-Flow Analysis using Synchronized Pushdown Systems: Best paper!
- Precise Interprocedural Dataflow Analysis via Graph Reachability: The IFDS framework
- Weighted Pushdown Systems and their Application to Interprocedural Dataflow Analysis: An enhanced version of IFDS
- Precise interprocedural dataflow analysis with applications to constant propagation: IDE framework, IDFS’s cute cousin with more all-purpose but more complexity as well.
- The Value Flow Graph: A Program Representation for Optimal Program Transformations]: Value Flow Graphs represent semantic equivalence of terms syntactically. This allows us to feed the knowledge of semantic equivalence into syntactic algorithms.
- Deterministic Parallel Fixpoint Computation
- Inclusion Based:
- Steengard & Andreson: Learn than from intro courses
- Binary Decision Tree Based:
- Probability Based:
- Making Context-sensitive Points-to Analysis with Heap Cloning Practical For The Real World
- Shape Analysis by WISC: Introduce shape analysis for heap
- Shape Analysis and Applications by UT
- Automated Verification of the Deutsch-Schorr-Waite Tree-Traversal Algorithm
- Program Slicing Handbook: For some values your analyzer intereted, we can slice the program to find related part of the program which impacts thos values.
- Program Tailoring: Slicing by Sequential Criteria
- Value Set Analysis:
- Analyzing Memory Accesses in x86 Executables: Introduce value-set analysis. This analysis uses an abstract domain for representing an over-approximation of the set of values that each data object can hold at each program point.
- Improved Memory-Access Analysis for x86 Executables: An improved version of VSA.
- DEEPVSA: Facilitating Value-set Analysis with Deep Learning for Postmortem Program Analysis: VSA + ML
- MicroX:Emulator for testing without input: A novel framework to emulate program without input for bug finding.
- Decompile (or Binary Translation):
- Reverse Compilation Techniques: This book is awesome, all about decompilation from frontend to backend.
- TIE: Principled Reverse Engineering of Types in Binary Programs: Recover types from a program based on type lattice induction.
- Polymorphic Type Inference for Machine Code: Another thesis for type infer in binary code.
- mcsema: Translate binary to LLVM bytecode.
- Software Foundation: Proof Assitance all in one
- Learning Material:
- Framework:
- spin: Model Checker for multi-thread software
- Java Path Finder: JPF is an extensible software analysis framework for Java bytecode.
- Safety
- Taint Analysis:
- Exploitation:
- Fuzz:
- Demand Interprocedural Program Analysis Using Logic Databases
- Static Program Analysis via 3-Valued Logic
- Programming Z3
- SAT Handbook
- The Calculus of Computation: All about SMT fundamental.
Abstract Machine primarirly discuss about the exact execution of a program.
- Abstract machines for programming language implementation
- Abstracting Definitional Interpreters: Solid foundation of semmantics
- Unleashing MAYHEM on Binary Code: How to structure a CRS, and new methods on symbolic execution
- Partial Evaluation