palash1992/GEM

Error making link prediction

Closed this issue · 5 comments

Hi, when I modified test_karate.py to run link prediction, the following error occurred:
Traceback (most recent call last):
  File "test_karate.py", line 54, in <module>
    MAP, prec_curv = lp.evaluateStaticLinkPrediction(G, embedding,0.8)
  File "/home/public/anaconda3/lib/python3.6/site-packages/gem-1.0.0-py3.6.egg/gem/evaluation/evaluate_link_prediction.py", line 24, in evaluateStaticLinkPrediction
    is_undirected=is_undirected
  File "/home/public/anaconda3/lib/python3.6/site-packages/gem-1.0.0-py3.6.egg/gem/utils/evaluation_util.py", line 50, in splitDiGraphToTrainTest
    train_digraph.remove_edge(ed, st)
  File "/home/public/anaconda3/lib/python3.6/site-packages/networkx/classes/digraph.py", line 692, in remove_edge
    raise NetworkXError("The edge %s-%s not in graph."%(u,v))
networkx.exception.NetworkXError: The edge 31-0 not in graph.
I modified it as follows:
MAP, prec_curv = lp.evaluateStaticLinkPrediction(G, embedding,0.8)
I would appreciate it if you could help me!

As the graph is directed, please set is_undirected=False
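A minimal sketch of why the traceback ends in NetworkXError, assuming a directed graph that stores an edge in one direction only. The toy graph below is made up for illustration and is not the karate graph from the test script. It only shows that networkx refuses to remove a reverse edge that was never added, which is what the remove_edge(ed, st) call in the traceback runs into.

```python
import networkx as nx

# Hypothetical directed graph with a single one-way edge 0 -> 31.
g = nx.DiGraph()
g.add_edge(0, 31)

forward_exists = g.has_edge(0, 31)   # the edge that was added
reverse_exists = g.has_edge(31, 0)   # the reverse edge was never added

# Removing the missing reverse edge raises NetworkXError, as in the traceback.
try:
    g.remove_edge(31, 0)
    error_message = None
except nx.NetworkXError as exc:
    error_message = str(exc)

print(forward_exists, reverse_exists)
print(error_message)
```

With is_undirected=False the reverse edge is presumably left alone, so the split only touches edges that actually exist in the graph.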

Hi Palash Goyal,
When I set is_undirected=False, the error goes away and I get the following output:
An algorithmic framework for representational learning on graphs. [Apr 9 2017]

Input graph path (-i:)=tempGraph.graph
Output graph path (-o:)=tempGraph.emb
Number of dimensions. Default is 128 (-d:)=2
Length of walk per source. Default is 80 (-l:)=80
Number of walks per source. Default is 10 (-r:)=10
Context size for optimization. Default is 10 (-k:)=10
Number of epochs in SGD. Default is 1 (-e:)=1
Return hyperparameter. Default is 1 (-p:)=1
Inout hyperparameter. Default is 1 (-q:)=1
Verbose output. (-v)=YES
Graph is directed. (-dr)=YES
Graph is weighted. (-w)=YES
Read 126 lines from tempGraph.graph
Preprocessing progress: 0.00%
Walking Progress: 0.00%
Learning Progress: 75.76%
MAP: 0.2912400920513133 preccision curve: [0.0, 0.0, 0.0, 0.0, 0.0]
But I don't think this is right.
I currently have my own network representation learning method, and it can already represent my own network data as vectors. Now I want to run link prediction to evaluate how well it works. However, I don't understand link prediction very well.
When evaluating link prediction, what input data format does it expect? Is it source-target pairs? Does the data need to be labeled?

The output above is correct.

The code takes the embedding method class object, the graph, and the obtained embedding vectors as inputs. The data does not need to be labeled.
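To make the idea of unlabeled link prediction concrete, here is a generic sketch, not GEM's own implementation. It assumes an embedding with one vector per node, scores node pairs by dot product, and checks how many of the top-ranked pairs are held-out edges. The function name precision_at_k, the dot-product scoring, the undirected treatment of pairs, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def precision_at_k(embedding, train_edges, test_edges, k):
    """Rank node pairs not seen in training by dot-product score and
    return the fraction of the top-k pairs that are held-out test edges.
    Pairs are treated as undirected here for simplicity."""
    train = {frozenset(e) for e in train_edges}
    test = {frozenset(e) for e in test_edges}
    candidates = [
        (dot(embedding[u], embedding[v]), u, v)
        for u, v in combinations(range(len(embedding)), 2)
        if frozenset((u, v)) not in train
    ]
    candidates.sort(reverse=True)  # highest score first
    hits = sum(1 for _, u, v in candidates[:k] if frozenset((u, v)) in test)
    return hits / k


# Toy data (made up): four nodes, one 2-D vector per node.
embedding = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_edges = [(0, 1)]  # edges the embedding was learned from
test_edges = [(2, 3)]   # held-out edges used only for scoring

p_at_1 = precision_at_k(embedding, train_edges, test_edges, k=1)
p_at_2 = precision_at_k(embedding, train_edges, test_edges, k=2)
print(p_at_1, p_at_2)
```

The only "labels" in this sketch are the held-out edges themselves, which come from the graph rather than from any separate annotation.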

Should this input contain the vectors of all the nodes, or only the vectors of the nodes in the training data?

The nodes in the training data and the test data are the same, because the split is done on edges, not nodes.
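An edge-based split can be sketched as follows, as a simplified illustration rather than the library's actual code. The helper name split_edges, the seed parameter, and the toy edge list are assumptions. Each edge is assigned to either the training or the test set, while both sets keep the full node set.

```python
import random


def split_edges(nodes, edges, train_ratio, seed=0):
    """Assign each edge to train or test at random; nodes are never split."""
    rng = random.Random(seed)
    train_edges, test_edges = [], []
    for edge in edges:
        if rng.random() <= train_ratio:
            train_edges.append(edge)
        else:
            test_edges.append(edge)
    # Both graphs share the same node set; only the edges differ.
    return (set(nodes), train_edges), (set(nodes), test_edges)


# Toy graph (made up): six nodes, eight directed edges.
nodes = range(6)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 2), (1, 3)]

(train_nodes, train_edges), (test_nodes, test_edges) = split_edges(nodes, edges, 0.8)
print(train_nodes == test_nodes)
print(len(train_edges), len(test_edges))
```

Because every node appears in both splits, the embedding passed to the evaluation needs one vector per node in the graph.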