A Concrete Syntax Tree (CST) parser and serializer library for Python
LibCST parses Python 3.0 through 3.11 source code as a CST that keeps all formatting details (comments, whitespace, parentheses, etc.). It's useful for building automated refactoring (codemod) applications and linters.
LibCST creates a compromise between an Abstract Syntax Tree (AST) and a traditional Concrete Syntax Tree (CST). By carefully reorganizing and naming node types and fields, we've created a lossless CST that looks and feels like an AST.
You can learn more about the value that LibCST provides and our motivations for the project in our documentation. Try it out with notebook examples.
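To make the "lossless" claim concrete, the sketch below shows what is lost without it. It uses only the standard-library `ast` module (not LibCST) and assumes Python 3.9+ for `ast.unparse`. The helper name `roundtrip_with_ast` is ours, chosen for illustration.

```python
import ast


def roundtrip_with_ast(source: str) -> str:
    """Parse with the standard-library AST, then regenerate source text."""
    return ast.unparse(ast.parse(source))


source = "x = (1 +  2)  # sum of two literals"
regenerated = roundtrip_with_ast(source)

# The regenerated code is semantically equivalent, but the comment, the
# redundant parentheses, and the original spacing are not preserved.
print(repr(source))
print(repr(regenerated))
```

A plain AST is enough for analysis, but writing a modified AST back to disk would reformat the file and drop its comments. A lossless tree avoids that.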
Example expression:
1 + 2
CST representation:
BinaryOperation(
  left=Integer(
    value='1',
    lpar=[],
    rpar=[],
  ),
  operator=Add(
    whitespace_before=SimpleWhitespace(
      value=' ',
    ),
    whitespace_after=SimpleWhitespace(
      value=' ',
    ),
  ),
  right=Integer(
    value='2',
    lpar=[],
    rpar=[],
  ),
  lpar=[],
  rpar=[],
)
To examine the tree that is parsed from a particular file, do the following:
python -m libcst.tool print <some_py_file.py>
Alternatively, you can import LibCST into a Python REPL and use the included parser and pretty printing functions:
>>> import libcst as cst
>>> from libcst.tool import dump
>>> print(dump(cst.parse_expression("(1 + 2)")))
BinaryOperation(
  left=Integer(
    value='1',
  ),
  operator=Add(),
  right=Integer(
    value='2',
  ),
  lpar=[
    LeftParen(),
  ],
  rpar=[
    RightParen(),
  ],
)
For a more detailed usage example, see our documentation.
LibCST requires Python 3.7+ and can be easily installed using most common Python packaging tools. We recommend installing the latest stable release from PyPI with pip:
pip install libcst
For parsing, LibCST ships with a native extension, so releases are distributed as binary wheels as well as source code. If a binary wheel is not available for your system (Linux/Windows x86/x64 and Mac x64/arm are covered), you'll need a recent Rust toolchain to install from source.
You'll need a recent Rust toolchain for developing.
Then, start by setting up and activating a virtualenv:
git clone git@github.com:Instagram/LibCST.git libcst
cd libcst
python3 -m venv ../libcst-env/ # just an example, put this wherever you want
source ../libcst-env/bin/activate
pip install --upgrade pip # optional, if you have an old system version of pip
pip install -r requirements.txt -r requirements-dev.txt
# If you're done with the virtualenv, you can leave it by running:
deactivate
We use ufmt to format code. To format changes to be conformant, run the following in the root:
ufmt format && python -m fixit.cli.apply_fix
We use slotscheck to check the correctness of class __slots__. To check that slots are defined properly, run:
python -m slotscheck libcst
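For context, here is a sketch of one kind of mistake a slots check is meant to catch, as we understand it: a slotted class that inherits from a base without `__slots__`. The class names are hypothetical and unrelated to LibCST's own node classes.

```python
class PlainBase:
    """No __slots__, so every instance carries a __dict__."""


class BrokenChild(PlainBase):
    # Declares slots, but still inherits __dict__ from PlainBase,
    # so the memory savings and attribute restrictions are lost.
    __slots__ = ("value",)

    def __init__(self, value: int) -> None:
        self.value = value


class SlottedBase:
    __slots__ = ()


class WorkingChild(SlottedBase):
    # Every class in the hierarchy defines __slots__, so no __dict__ exists.
    __slots__ = ("value",)

    def __init__(self, value: int) -> None:
        self.value = value


broken = BrokenChild(1)
working = WorkingChild(1)

print(hasattr(broken, "__dict__"))   # the broken hierarchy still has __dict__
print(hasattr(working, "__dict__"))  # the fully slotted hierarchy does not
```

Nothing fails at runtime in the broken case, which is why an automated check is useful.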
To run all tests, you'll need to do the following in the root:
python -m unittest
You can also run individual tests by using unittest and specifying a module like this:
python -m unittest libcst.tests.test_batched_visitor
See the unittest documentation for more examples of how to run tests.
In order to build LibCST, which includes a native parser module, you will need to have the Rust build tool cargo on your path. You can usually install cargo using your system package manager, but the most popular way to install cargo is using rustup.
To build just the native parser, do the following from the native directory:
cargo build
To build the libcst.native module and install libcst, run this from the root:
pip uninstall -y libcst
pip install -e .
We use Pyre for type-checking.
To verify types for the library, do the following in the root:
pyre check
Note: You may need to run the pip install -e . command prior to type checking; see the section above on building.
To generate documents, do the following in the root:
sphinx-build docs/source/ docs/build/
- Advanced full repository facts providers like fully qualified name and call graph.
LibCST is MIT licensed, as found in the LICENSE file.
- Guido van Rossum for creating the parser generator pgen2 (originally used in lib2to3 and forked into parso).
- David Halter for parso which provides the parser and tokenizer that LibCST sits on top of.
- Zac Hatfield-Dodds for hypothesis integration which continues to help us find bugs.
- Zach Hammer for improved type annotations for Mypy compatibility.