Entity types
In Cypher, node and edge types are represented by :ColonNotation. For example,
(A:Neuron)-[AB:Synapse]->(B:Neuron)
NetworkX has no concept of entity "types," so this will be the first time that this codebase mandates a data schema (i.e., a type attribute on the entities in the graph). I'm not sure this is something I want to enforce, but if we do decide to use vertex/edge attributes like this, I'd like to open discussion in this issue to establish what schema we want to support.
I think type effectively serves as a kind of index. A graph search engine can use it to search only a subset of nodes/edges instead of all of them. Our algorithm does not benefit from this yet, but it's nice to have, maybe just for the sake of matching Cypher (?).
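To make the "type as index" idea concrete, here is a minimal sketch. The helper name build_type_index and the attribute name "type" are assumptions for illustration only, not part of this codebase:

```python
from collections import defaultdict

import networkx as nx


def build_type_index(g, key="type"):
    """Group node IDs by the value stored under `key` (hypothetical helper)."""
    index = defaultdict(set)
    # g.nodes(data=key) yields (node, value) pairs; value is None if unset.
    for node, node_type in g.nodes(data=key):
        if node_type is not None:
            index[node_type].add(node)
    return index


g = nx.DiGraph()
g.add_node("A", type="Neuron")
g.add_node("B", type="Neuron")
g.add_node("C", type="Glia")

index = build_type_index(g)
# A matcher for (n:Neuron) could start from these candidates only,
# rather than scanning every node in the graph.
candidates = index["Neuron"]
```

Whether such an index is worth maintaining depends on how often the graph changes between queries; this sketch rebuilds it from scratch each time.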
If we do this, I recommend storing it under the __type__ property. Why double underscores on both ends? It adheres more closely to Python convention and leaves room for other uses. Constructing a graph by hand under this convention may look intimidating to users, but that concern will go away as soon as graph mutation is supported.
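For reference, building a graph by hand under this convention might look like the sketch below. It uses only standard NetworkX calls and assumes the __type__ attribute name proposed above; the node and edge values mirror the Cypher example at the top of this issue:

```python
import networkx as nx

# Hand-built equivalent of (A:Neuron)-[AB:Synapse]->(B:Neuron),
# assuming the type is stored under the "__type__" attribute.
g = nx.DiGraph()
g.add_node("A", __type__="Neuron")
g.add_node("B", __type__="Neuron")
g.add_edge("A", "B", __type__="Synapse")

# Read the types back from the node and edge attribute dicts.
node_types = {n: attrs["__type__"] for n, attrs in g.nodes(data=True)}
edge_type = g.edges["A", "B"]["__type__"]
```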
Oh I like __label__ (previously __type__), that seems like the right move for sure! Good call. We can also perhaps have a few utilities to easily assign labels before adding proper support for mutations, like (just a sketch, I don't feel strongly about this API in particular):
from grandcypher import assign_labels
g_with_labels = assign_labels(g, assignments)
where assignments can be a dict or callable:
assignments = {"a": "Customer", "b": "Store", "c": "Product"}
# or:
assignments = lambda x: x.split(":")[1:] # for node IDs of format "Jordan:Customer"
One thing to keep in mind is that objects can have more than one entity label assigned, so __label__ may have to be a complex dtype like set instead of a simple str.
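A rough sketch of what such a helper could look like, with labels stored as a set. This is only an illustration of the idea above: assign_labels does not exist in grandcypher, and the copy-then-annotate behavior and the key parameter are assumptions:

```python
import networkx as nx


def assign_labels(g, assignments, key="__label__"):
    """Return a copy of `g` with a set of labels stored on each node.

    `assignments` is either a dict {node_id: label or iterable of labels}
    or a callable that takes a node ID and returns the same.
    Hypothetical helper, not an existing grandcypher API.
    """
    lookup = assignments if callable(assignments) else assignments.get
    labeled = g.copy()  # leave the caller's graph untouched
    for node in labeled.nodes:
        labels = lookup(node)
        if labels is None:
            labels = ()  # node has no assignment: store an empty set
        elif isinstance(labels, str):
            labels = (labels,)  # a single label, not an iterable of characters
        labeled.nodes[node][key] = set(labels)
    return labeled


g = nx.Graph()
g.add_nodes_from(["a", "b", "c"])
by_dict = assign_labels(g, {"a": "Customer", "b": "Store", "c": "Product"})

h = nx.Graph()
h.add_node("Jordan:Customer:Member")
by_callable = assign_labels(h, lambda x: x.split(":")[1:])
```

Storing a set even for single-label nodes keeps the matching logic uniform: a pattern like (n:Customer) becomes a membership test regardless of how many labels a node carries.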
[EDIT] Went back and changed "types" to "labels" to match cypher terminology.