semantic is a Haskell library and command line tool for parsing, analyzing, and comparing source code.
In a hurry? Check out our documentation of example uses for the semantic command line tool.
Table of Contents |
---|
Usage |
Language support |
Development |
Technology and architecture |
Licensing |
Run semantic --help for a complete list of up-to-date options.
Usage: semantic parse ([--sexpression] | [--json] | [--json-graph] | [--symbols]
                      | [--dot] | [--show] | [--quiet]) [FILES...]
  Generate parse trees for path(s)
Available options:
  --sexpression            Output s-expression parse trees (default)
  --json                   Output JSON parse trees
  --json-graph             Output JSON adjacency list
  --symbols                Output JSON symbol list
  --dot                    Output DOT graph parse trees
  --show                   Output using the Show instance (debug only, format
                           subject to change without notice)
  --quiet                  Don't produce output, but show timing stats
Usage: semantic diff ([--sexpression] | [--json] | [--json-graph] | [--toc] |
                     [--dot] | [--show]) [FILE_A] [FILE_B]
  Compute changes between paths
Available options:
  --sexpression            Output s-expression diff tree (default)
  --json                   Output JSON diff trees
  --json-graph             Output JSON diff trees
  --toc                    Output JSON table of contents diff summary
  --dot                    Output the diff as a DOT graph
  --show                   Output using the Show instance (debug only, format
                           subject to change without notice)
Usage: semantic graph ([--imports] | [--calls]) [--packages] ([--dot] | [--json]
                      | [--show]) ([--root DIR] [--exclude-dir DIR]
                      DIR:LANGUAGE | FILE | --language ARG (FILES... | --stdin))
  Compute a graph for a directory or from a top-level entry point module
Available options:
  --imports                Compute an import graph (default)
  --calls                  Compute a call graph
  --packages               Include a vertex for the package, with edges from it
                           to each module
  --dot                    Output in DOT graph format (default)
  --json                   Output JSON graph
  --show                   Output using the Show instance (debug only, format
                           subject to change without notice)
  --root DIR               Root directory of project. Optional, defaults to
                           entry file/directory.
  --exclude-dir DIR        Exclude a directory (e.g. vendor)
  --language ARG           The language for the analysis.
  --stdin                  Read a list of newline-separated paths to analyze
                           from stdin.
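As a rough illustration of scripting the subcommands above, the following Python sketch assembles argument lists for semantic parse and semantic diff using only the output flags listed in the help text. The helper names (parse_command, diff_command) and the example file names are hypothetical and not part of semantic. The sketch only builds the command; running it is left to the caller and assumes a semantic binary is available on the PATH.

```python
from typing import List, Sequence

# Output-format flags copied from the help text above (without the leading "--").
PARSE_FORMATS = {"sexpression", "json", "json-graph", "symbols", "dot", "show", "quiet"}
DIFF_FORMATS = {"sexpression", "json", "json-graph", "toc", "dot", "show"}


def parse_command(files: Sequence[str], fmt: str = "sexpression") -> List[str]:
    """Build the argv for `semantic parse` (hypothetical helper)."""
    if fmt not in PARSE_FORMATS:
        raise ValueError(f"unknown parse format: {fmt}")
    return ["semantic", "parse", f"--{fmt}", *files]


def diff_command(file_a: str, file_b: str, fmt: str = "sexpression") -> List[str]:
    """Build the argv for `semantic diff` (hypothetical helper)."""
    if fmt not in DIFF_FORMATS:
        raise ValueError(f"unknown diff format: {fmt}")
    return ["semantic", "diff", f"--{fmt}", file_a, file_b]
```

The resulting list could then be handed to something like subprocess.run(cmd, capture_output=True, text=True) to capture the tool's output. Validating the format name up front keeps a typo from reaching the command line.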
| Priority | Language | Parse | Assign | Diff | ToC | Symbols | Import graph | Call graph | Control flow graph |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Ruby | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | |
| 2 | JavaScript | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | |
| 3 | TypeScript | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | |
| 4 | Python | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | |
| 5 | Go | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | |
| | PHP | ✅ | ✅ | ✅ | ✅ | ✅ | | | |
| | Java | ✅ | ✅ | ✅ | 🔶 | ✅ | | | |
| | JSON | ✅ | ✅ | ✅ | N/A | N/A | N/A | N/A | |
| | JSX | ✅ | ✅ | ✅ | 🔶 | | | | |
| | Haskell | ✅ | ✅ | ✅ | 🔶 | ✅ | | | |
| | Markdown | ✅ | ✅ | ✅ | 🔶 | N/A | N/A | N/A | |
- ✅ — Supported
- 🔶 — Partial support
- 🚧 — Under development
We use cabal's Nix-style local builds for development. To get started quickly:
git clone git@github.com:github/semantic.git
cd semantic
git submodule sync --recursive && git submodule update --init --recursive --force
cabal new-update
cabal new-build
cabal new-test
cabal new-run semantic -- --help
semantic requires GHC 8.6.4. We recommend using ghcup to sandbox GHC versions. Our version bounds are based on Stackage LTS versions. The current LTS version is 13.13; stack build should also work if you prefer.
Architecturally, semantic:
- Reads blobs.
- Generates parse trees for those blobs with tree-sitter (an incremental parsing system for programming tools).
- Assigns those trees into a generalized representation of syntax.
- Performs analysis, computes diffs, or just returns parse trees.
- Renders output in one of many supported formats.
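The five steps above can be pictured as a small pipeline. The Python sketch below is only a toy model of that flow, not semantic's Haskell implementation: Blob, Term, parse, assign, and the two renderers are hypothetical names, and the parse function is a line-based stand-in for tree-sitter rather than a real parser.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Blob:
    """Step 1: a blob is a path plus its source text."""
    path: str
    source: str


@dataclass
class Term:
    """Step 3: a generalized, language-agnostic syntax node."""
    kind: str
    children: List["Term"] = field(default_factory=list)


def parse(blob: Blob) -> List[Tuple[str, str]]:
    """Step 2 stand-in: each non-blank line becomes one raw (type, text) node.
    The real tool delegates this step to tree-sitter."""
    return [("line", line.strip()) for line in blob.source.splitlines() if line.strip()]


def assign(raw: List[Tuple[str, str]]) -> Term:
    """Step 3: map raw nodes onto the generalized Term representation."""
    kinds = ["Comment" if text.startswith("#") else "Statement" for _, text in raw]
    return Term("Program", [Term(kind) for kind in kinds])


def render_sexpression(term: Term) -> str:
    """Step 5: render a term as an s-expression."""
    inner = "".join(" " + render_sexpression(child) for child in term.children)
    return f"({term.kind}{inner})"


def render_json(term: Term) -> str:
    """Step 5: render a term as JSON."""
    def to_dict(t: Term) -> dict:
        return {"kind": t.kind, "children": [to_dict(c) for c in t.children]}
    return json.dumps(to_dict(term))


def run(blob: Blob, render: Callable[[Term], str] = render_sexpression) -> str:
    """Chain the steps; step 4 here just returns the parse tree unchanged."""
    term = assign(parse(blob))
    return render(term)


print(run(Blob("example.rb", "# greet\nputs 'hi'\n")))
# prints: (Program (Comment) (Statement))
```

Keeping the renderer as a parameter mirrors the idea that the same tree can be emitted in several output formats; passing render_json instead yields the JSON form of the same tree.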
Semantic leverages a number of interesting algorithms and techniques:
- Myers' algorithm (SES) as described in the paper An O(ND) Difference Algorithm and Its Variations.
- RWS as described in the paper RWS-Diff: Flexible and Efficient Change Detection in Hierarchical Data.
- Open unions and data types à la carte.
- An implementation of Abstracting Definitional Interpreters extended to work with an à la carte representation of syntax terms.
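To give a feel for the first item in the list above, here is a minimal Python sketch of the greedy shortest-edit-script search from the O(ND) paper. It is an illustration under simplifying assumptions, not semantic's Haskell code: the function name myers_diff is hypothetical, and for clarity the sketch stores a full edit script per diagonal instead of using the paper's more memory-efficient backtracking.

```python
from typing import Dict, List, Sequence, Tuple

Edit = Tuple[str, str]  # ("-", item) delete, ("+", item) insert, ("=", item) keep


def myers_diff(a: Sequence[str], b: Sequence[str]) -> List[Edit]:
    """Return a shortest edit script turning a into b (insertions and deletions only)."""
    n, m = len(a), len(b)
    # frontier[k] = (furthest x reached on diagonal k = x - y, script that got there)
    frontier: Dict[int, Tuple[int, List[Edit]]] = {1: (0, [])}
    for d in range(n + m + 1):  # d = number of non-matching edits so far
        for k in range(-d, d + 1, 2):
            go_down = k == -d or (k != d and frontier[k - 1][0] < frontier[k + 1][0])
            if go_down:
                # Step down from diagonal k + 1: consume one item of b (insertion).
                x, script = frontier[k + 1]
                y = x - k
                if 1 <= y <= m:  # skip the virtual first step and out-of-grid moves
                    script = script + [("+", b[y - 1])]
            else:
                # Step right from diagonal k - 1: consume one item of a (deletion).
                x, script = frontier[k - 1]
                x += 1
                if x <= n:  # skip out-of-grid moves
                    script = script + [("-", a[x - 1])]
            y = x - k
            # Follow the "snake": matching items cost nothing.
            while x < n and y < m and a[x] == b[y]:
                script = script + [("=", a[x])]
                x += 1
                y += 1
            frontier[k] = (x, script)
            if x >= n and y >= m:
                return script
    return []  # unreachable: d = n + m always suffices
```

For example, myers_diff("ab", "b") returns [("-", "a"), ("=", "b")]. Copying the script on every step keeps the code short at the cost of extra memory, which is acceptable for a sketch but not how a production diff would be written.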
Contributions are welcome! Please see our contribution guidelines and our code of conduct for details on how to participate in our community.
Semantic is licensed under the MIT license.