BorgwardtLab/P-WL

Invariance to node permutation?

spark157 opened this issue · 2 comments

Hello,

I'm playing around with the code in a Jupyter notebook and wasn't sure about the following result when I consider permuting the node indices of a simple graph. I expected that when I permute the node indices (ie. to isomorphic graphs) then I will get the same P-WL-C representation - but I'm not.

My code (after the necessary imports) is below and hopefully I permuted the graphs correctly (worth verifying first or my question is moot!).

Thanks so much for clarifying if you can.

Scott

# Quick double check to make sure that results are invariant

pwl = PersistentWeisfeilerLehman(
            use_cycle_persistence=True,
            use_original_features=False,
            use_label_persistence=True,
            store_persistence_diagrams=True,
            metric='minkowski',
            p=1,
            smooth=False)

# Graph taken from Figure 2 of Shervashidze, 2011
graph1 = ig.Graph([(0,2),(1,2),(2,3),(2,4),(3,4),(3,5),(4,5)])
graph1.vs['label'] = [1,1,4,3,5,2]

# Permute the same graph for different node ordering
# (0 5)(1 4)(2 3)
graph2 = ig.Graph([(0,1),(0,2),(1,2),(1,3),(2,3),(3,4),(3,5)])
graph2.vs['label'] = [2,5,3,4,1,1]

# Permute again for a third graph: (2 3)
graph3 = ig.Graph([(0,3),(1,3),(2,3),(2,4),(2,5),(3,4),(4,5)])
graph3.vs['label'] = [1,1,3,4,5,2]

graphs = [graph1,graph2,graph3]

pwl.transform(graphs, 1)

Output:

(array([[ 4.,  2.,  2.,  2.,  2.,  0.,  0.,  2.,  4.,  2., 14.,  7.,  4.,
          5.,  7.,  0.,  7.,  0., 12.,  5.],
        [ 4.,  2.,  2.,  2.,  2.,  0.,  2.,  4.,  2.,  0., 14.,  7.,  7.,
          4.,  5.,  0.,  7., 12.,  0.,  5.],
        [ 4.,  2.,  2.,  2.,  2.,  0.,  2.,  0.,  4.,  2., 14.,  7.,  4.,
          5.,  7.,  0.,  7.,  0., 12.,  5.]]), {0: 10, 1: 10})

Dear Scott,

thanks for reaching out! We will definitely check this as soon as possible and get back to you. P-WL should be impervious to permutations, but the underlying persistent homology calculation sometimes needs to make a choice between two equivalent nodes---this might be a potential error source in our code.

Dear Scott,

sorry for taking so long to answer. I looked at the case and this is indeed caused by a consistency check in the persistence calculation: when going through the edges of a filtration, we have to select one index of an edge. Currently, this selection is done following the 'elder rule' in persistent homology, i.e. we always take the smaller index. This leads to an inconsistent mapping of labels.

The behaviour is different for P-WL-UC because in this case, the same persistence value will be assigned to all tuples (and the persistence diagrams of the graphs that you added are indeed identical). Thank you very much for raising this issue and thanks for your patience; I think to fix this, we would have to consider distributing the persistence values among the labels in another manner; or, failing that, another scheme for calculating persistent homology based on labels.

Would you be interested if we keep you up to date concerning this issue?

Best,
Bastian