Cytosm is an open-source software library for converting Cypher queries into plain old SQL on-the-fly.
Cytosm was originally developed by researchers and engineers working on graph analytics across a variety of databases.
If you'd like to contribute to Cytosm, be sure to review the contribution guidelines.
We use GitHub issues for tracking requests and bugs.
You can easily use Cytosm as a library in your code: think of it as a simple mapping function that, given a valid Cypher string, returns a valid SQL string.
A Cypher string goes through several transformations on its journey through Cytosm:
- Parsing: An auto-generated ANTLR parser based on the openCypher EBNF grammar. It creates an AST to be used later on.
- PathFinder: It navigates the AST, given a graph topology file (gTop), in order to make Cypher queries more concrete.
- Cypher2SQL: The module where all the magic happens: it translates the concrete Cypher queries into SQL.
Cytosm: Declarative Property Graph Queries Without Data Migration [Grades/2017]
A set of simple optimisations that aim to make Cypher queries more concrete. It prevents the mapper from exploring path patterns in the Cypher queries that are logically correct, but impossible in the light of the graph model used on top of the relational database.
With this tool, the mapping process is simpler and we make SQL queries more efficient.
See PathFinder
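To illustrate the kind of pruning described above, here is a minimal Python sketch. It is not Cytosm's implementation or API: the `TOPOLOGY` set, the label names, and the `concrete_paths` function are all hypothetical, and a real gTop file carries much more information. The sketch enumerates label assignments for a one-hop undirected pattern such as (a)--(b) and keeps only those that the graph model allows.

```python
from itertools import product

# Hypothetical graph model: (source label, edge type, target label) triples
# that actually exist on top of the relational schema.
TOPOLOGY = {
    ("Person", "KNOWS", "Person"),
    ("Person", "WORKS_AT", "Company"),
}

NODE_LABELS = sorted({label for s, _, t in TOPOLOGY for label in (s, t)})
EDGE_TYPES = sorted({e for _, e, _ in TOPOLOGY})


def concrete_paths(source=None, edge=None, target=None):
    """Expand a one-hop undirected pattern into the label combinations
    that the topology permits. None means the pattern left that element
    unlabelled."""
    sources = [source] if source else NODE_LABELS
    edges = [edge] if edge else EDGE_TYPES
    targets = [target] if target else NODE_LABELS
    kept = []
    for s, e, t in product(sources, edges, targets):
        # Undirected match: accept the triple in either direction.
        if (s, e, t) in TOPOLOGY or (t, e, s) in TOPOLOGY:
            kept.append((s, e, t))
    return kept


if __name__ == "__main__":
    for path in concrete_paths():
        print(path)
```

With this toy topology, a fully unlabelled pattern has eight candidate label combinations, of which only three survive. Each pruned combination is one fewer branch for the later SQL generation stage to consider.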
This module takes the concrete Cypher queries produced by the PathFinder module and:
- Analyses dependencies between Cypher variables and tracks their scope.
- Creates an intermediate language representation of the query (something closer to SQL, but not quite there yet). It is a hierarchical representation.
- From the hierarchy created in the previous stage, it builds a sequence of nested joins and unions in SQL to represent the graph patterns indicated in Cypher.
See Cypher2SQL
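The last step, turning concrete paths into joins and unions, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Cytosm's code: the table names, column names, and the `path_to_sql` and `paths_to_sql` functions are hypothetical, and the sketch skips the variable dependency analysis and the intermediate representation entirely. Each concrete one-hop path becomes a SELECT with two joins, and alternative paths are combined with UNION.

```python
# Hypothetical mapping from node labels to (table, key column).
NODE_TABLES = {
    "Person": ("person", "id"),
    "Company": ("company", "id"),
}

# Hypothetical mapping from edge types to
# (join table, column referencing the source, column referencing the target).
EDGE_TABLES = {
    "KNOWS": ("person_knows_person", "person1_id", "person2_id"),
    "WORKS_AT": ("person_works_at", "person_id", "company_id"),
}


def path_to_sql(source, edge, target):
    """Render one concrete (source)-[edge]->(target) path as a SELECT with joins."""
    src_table, src_key = NODE_TABLES[source]
    tgt_table, tgt_key = NODE_TABLES[target]
    edge_table, src_ref, tgt_ref = EDGE_TABLES[edge]
    return (
        f"SELECT a.{src_key} AS a_id, b.{tgt_key} AS b_id "
        f"FROM {src_table} a "
        f"JOIN {edge_table} e ON e.{src_ref} = a.{src_key} "
        f"JOIN {tgt_table} b ON b.{tgt_key} = e.{tgt_ref}"
    )


def paths_to_sql(paths):
    """Combine the SELECTs for all alternative concrete paths with UNION."""
    return "\nUNION\n".join(path_to_sql(*p) for p in paths)


if __name__ == "__main__":
    print(paths_to_sql([
        ("Person", "KNOWS", "Person"),
        ("Person", "WORKS_AT", "Company"),
    ]))
```

The number of UNION branches grows with the number of concrete paths, which is why pruning impossible paths beforehand keeps the generated SQL smaller.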
A description of the graph hiding in your relational database. It also describes how to map abstract nodes/edges in the graph to specific database tables/columns.
Find more details about gTop
A gTop file can be automatically discovered by the "Graph Extraction" module (to be open-sourced soon).
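As a rough picture of the two roles a gTop plays (describing the graph, and mapping it onto tables and columns), consider the simplified Python structure below. It is a hypothetical stand-in and does not follow the actual gTop file format: the key names, table names, and helper functions are assumptions made for illustration only.

```python
# Simplified, hypothetical stand-in for a gTop description.
# The real gTop file format differs; see the gTop documentation.
GTOP = {
    # The graph as the query author sees it.
    "abstract": {
        "nodes": ["Person", "Company"],
        "edges": [
            {"type": "WORKS_AT", "source": "Person", "target": "Company"},
        ],
    },
    # How each abstract element maps onto the relational schema.
    "implementation": {
        "nodes": {
            "Person": {"table": "person", "id_column": "id"},
            "Company": {"table": "company", "id_column": "id"},
        },
        "edges": {
            "WORKS_AT": {
                "table": "person_works_at",
                "source_column": "person_id",
                "target_column": "company_id",
            },
        },
    },
}


def table_for_node(gtop, label):
    """Return the table backing a node label, or raise if the label is unknown."""
    if label not in gtop["abstract"]["nodes"]:
        raise KeyError(f"label {label!r} is not part of the graph model")
    return gtop["implementation"]["nodes"][label]["table"]


def edges_between(gtop, source, target):
    """List the edge types the graph model allows from source to target."""
    return [
        e["type"]
        for e in gtop["abstract"]["edges"]
        if e["source"] == source and e["target"] == target
    ]
```

Keeping the abstract graph separate from its relational mapping means the same Cypher query can, in principle, be pointed at a different schema by swapping the mapping alone.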
Cytosm queries have been run on a variety of backends, obtaining quite surprising results. Please find more details in the sibling repo for Cytosm benchmarking.
- Directed relationships (see the PopulateJoins pass for more information on the current status)
- Arbitrary hops that could be defined in the gTop (as with the previous item, more detail can be found in PopulateJoins)
- Proper handling of the COUNT function (we only support limited use cases)
- SKIP, LIMIT and ORDER BY are not propagated appropriately on "wide" queries, that is, queries involving at least one UNION in the generated SQL.
- Improve the type-checker to verify the correctness of any expression before rendering. The current version is incomplete.
The following would be nice to have:
- Pattern expressions are not supported at all (the grammar does not even include them), so queries such as the following are not supported:
  MATCH (a), (b) WHERE NOT((a)--(b))
- Improve CypherConverter and PathFinder to generate AST nodes instead of using intermediary string representations.
- Improve the PathFinder-related code to use the full information available about variables and their types.