An implement for Efficient Subgraph Matching: Harmonizing DynamicProgramming, Adaptive Matching Order, and Failing Set Together (SIGMOD'19)
Deeply thankful to the code of In-Memory Subgraph Matching: An In-depth Study(SIGMOD'20)
-
Adjacency List for Graph
$[N_1(u_1),N_2(u1),...,N_1(u_2),N_2(u_2),...]$
where
by which both
-
Candidates
$[C_1(u_1),C_2(u1),...,C_1(u2),C_2(u_2),...]$
where
bfs tree of
the start vertex of bfs is defined by
1. StartVertex = argmin (|C(u)| / D(u))
2. BFSTree, BFSOrder = BFS(Q, StartVertex)
3. DAG = BuildDAG(BFSTree, BFSOrder)
forward filter: filter order is defined by BFS order
backward filter: filter order is defined by reversed BFS order
DP-iso alternately use forward and backward filter to prune candidates
Take forward filter as an example, DP-iso Filter try to prun all the
Support = ForwardNeighbor(u)->C(u)
Count = |ForwardNeighbor(u)|
if Support < Count
Prun(C(u))
-
Edge Matrix
$M$ for each
$M[u_1][u_2]$ :$[Bitset(C_1),Bitset(C_2),...]$ where
$Bitset(C_i)$ is the Bitset of$C(u_2)$ related to$C_i(u_1)$ Then given
$u_1,u_2,M$ , we can know if$C_i(u_1)$ is in the embedding, the valid candidates of$u_2$ is$M[u_1][u_2]$
DP-iso use a DP of weight for mathing order, where the weight is an estimation of the maxinum embedding of paths.
where
Since the valid candidates change when enumarating, the matching order in DP-iso is adaptive, ad we use a priority queue to record the extendable vertices ($W(C(u)$) for first,
DP-iso use a bitset for failing set, when enumerating/searching,
Searching at a leave node,
1. FailingSet = Empty, When success
2. FailingSet = Ancestor(u), When u has no candidates
3. FailingSet = Ancestor(u,v), When u,v confilct
For a non-leave node p,
1. FailingSet = Empty, if exist a Empty FailingSet in child node, when no failure exists
2. FailingSet = FailingSet(u), if child nodes u indicates the failure is not by p
3. FailingSet = Union(FailingSet(u)), for all chile nodes u, if child nodes u does not indicate the failure is not by p