jianhao2016/AllSet

Can you explain better what Add_Self_Loops and expand_edge_index do exactly?

Closed this issue · 2 comments

Hi,

I'm trying to understand what your code does. Let's say we have a toy dataset like this:
V;E =
v0, v0, v1, v1, v2, v2, v3
e1, e2, e2, e1, e2, e3, e3

Which translates into this edge_index:
[[0, 0, 1, 1, 2, 2, 3],
[4, 5, 5, 4, 5, 6, 6]],

import numpy as np
import torch
from torch_geometric.data import Data
from torch_geometric.utils import coalesce

edge_index = torch.tensor([[0, 0, 1, 1, 2, 2, 3], [4, 5, 5, 4, 5, 6, 6]], dtype=torch.long)
edge_index = coalesce(edge_index)

ei = edge_index.numpy()
num_nodes = len(np.unique(ei[0]))
num_hyperedges = len(np.unique(ei[1]))

data = Data(edge_index=edge_index, n_x=num_nodes, num_hyperedges=num_hyperedges)

Now, if I use your Add_Self_Loops code I obtain this edge_index:
tensor([[ 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3],
        [ 4, 5, 7, 4, 5, 8, 5, 6, 9, 6, 10]])

This is a little confusing. Shouldn't the same node be added? What exactly are you doing in this code, and why?

Same question for expand_edge_index.

tensor([[ 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3],
        [ 5, 7, 8, 4, 6, 8, 6, 7, 10, 9, 11]])

What kind of expansion are you implementing? What is the difference between the two implementations? Why are they needed for your models AllDeepSets and AllSetTransformer?

Hi @giuliacassara ,

For Add_Self_Loops(), we assign a new "unique" hyperedge_id to each node; this acts as the self-loop from the standard graph case. In your example,

[[0, 0, 1, 1, 2, 2, 3],
[4, 5, 5, 4, 5, 6, 6]],

since the "real" hyperedge_ids run from 4 to 6, we assign nodes 0, 1, 2, 3 new hyperedges that each link to one node only, which are hyperedge_ids 7, 8, 9, 10 in the example you pointed out. That is, hyperedge 7 is associated only with node 0, hyperedge 8 only with node 1, and so on. Hence, hyperedges 7 to 10 are indeed self-loops.
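To make the indexing concrete, here is a minimal NumPy sketch of that idea. This is a hypothetical re-implementation, not the repo's actual code: the function name is mine, and I assume hyperedge ids are offset by the node count (which is why the "real" ids start at 4 and the new ones at 7 in your example).

```python
import numpy as np

def add_hyperedge_self_loops(edge_index, num_nodes, num_hyperedges):
    """Hypothetical sketch: append one fresh, single-node hyperedge per node,
    the hypergraph analogue of a graph self-loop."""
    # Existing hyperedge ids occupy num_nodes .. num_nodes + num_hyperedges - 1
    # (4..6 in the example), so the new "self-loop" ids start right after (7..10).
    first_new_id = num_nodes + num_hyperedges
    loops = np.stack([
        np.arange(num_nodes),                               # each node ...
        np.arange(first_new_id, first_new_id + num_nodes),  # ... gets its own hyperedge
    ])
    return np.concatenate([edge_index, loops], axis=1)

edge_index = np.array([[0, 0, 1, 1, 2, 2, 3],
                       [4, 5, 5, 4, 5, 6, 6]])
out = add_hyperedge_self_loops(edge_index, num_nodes=4, num_hyperedges=3)
# the appended columns pair node i with hyperedge 7 + i
```

The new incidences are appended at the end here; the sorted order in your printout would come from coalescing afterwards.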

For expand_edge_index(), please check our description in

def expand_edge_index(data, edge_th=0):

It is only needed when we want to "exclude" the self contribution in message passing on hypergraphs. It exists only for the formulation of Equation (6) in our paper and, if I remember correctly, it is not used in our presented experiments.
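For intuition, here is a rough NumPy sketch of one way to realize that "exclude the self contribution" expansion. This is a hypothetical illustration, not the repo's expand_edge_index; the exact copy ordering and id assignment in the actual code may differ. The idea: each hyperedge e = {v1, ..., vk} is replaced by k copies, where the copy dedicated to vi omits vi itself, so aggregating over a copy never includes the receiving node's own contribution.

```python
import numpy as np
from collections import defaultdict

def expand_exclude_self(edge_index, num_nodes):
    """Hypothetical sketch: replace each hyperedge by one copy per member,
    where the copy for member v contains every other member but not v."""
    members = defaultdict(list)
    for v, e in zip(edge_index[0], edge_index[1]):
        members[int(e)].append(int(v))

    rows, cols = [], []
    next_id = num_nodes  # new hyperedge ids again start after the node ids
    for e in sorted(members):
        for excluded in members[e]:
            # copy of e that omits `excluded`
            for v in members[e]:
                if v != excluded:
                    rows.append(v)
                    cols.append(next_id)
            next_id += 1
    return np.array([rows, cols])

edge_index = np.array([[0, 0, 1, 1, 2, 2, 3],
                       [4, 5, 5, 4, 5, 6, 6]])
out = expand_exclude_self(edge_index, num_nodes=4)
# e.g. node 0 ends up a member of copies 5, 7 and 8 under this id scheme
```

Each node is thus attached to every copy of its hyperedges except the copy that excludes it, which is what lets the model drop the node's own message.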

Please let me know if you have further questions.

Best,
Eli

Hi,
everything is clear, thank you!