pathpy/pathpyG

How should we handle `num_nodes`?

M-Lampert opened this issue · 3 comments

Currently the method MultiOrderModel.lift_order_edge_index(...) requires num_nodes as argument. If you set this incorrectly, you will get an error as follows:

IndexError                                Traceback (most recent call last)
Cell In[33], line 10
      7 m = pp.MultiOrderModel.from_DAGs(toy_paths, max_order=max_order, mode = "propagation")
      9 edge_index = m.layers[1].data.edge_index
---> 10 m.lift_order_edge_index(edge_index, edge_index.shape[1])

File /workspaces/pathpyG/src/pathpyG/core/MultiOrderModel.py:64, in MultiOrderModel.lift_order_edge_index(edge_index, num_nodes)
     61 outdegree = degree(edge_index[0], dtype=torch.long, num_nodes=num_nodes)
     62 # Map outdegree to each destination node to create an edge for each combination
     63 # of incoming and outgoing edges for each destination node
---> 64 outdegree_per_dst = outdegree[edge_index[1]]
     65 num_new_edges = outdegree_per_dst.sum()
     66 # Create sources of the new higher-order edges

IndexError: index 3 is out of bounds for dimension 0 with size 3

This happened to @VincenzoPerri and will probably also happen to other users in the future. We should think about doing some kind of check and a warning or setting a default for easier usability.

The function Graph.from_edge_index now accepts a parameter num_nodes

I do not quite follow at which point this issue occurs, i.e. where the specification of num_nodes in the MultiOrderModel class is missing. This does not seem to be related to the Graph class?

Could you create a minimal example that reproduces the error, so I can debug this?

The issue was here:

def lift_order_edge_index(edge_index: torch.Tensor, num_nodes: int) -> torch.Tensor:

MultiOrderModel.lift_order_edge_index(...) requires the num_nodes parameter and if it is set correctly, then everything works as expected. But @VincenzoPerri 's problem was that he specified the number of edges instead. I took this as a precedent that the current implementation is not as user-friendly as it could be. But I am unsure of what to change.