PositionRank algorithm fails with networkx version 3.0: IndexError in `candidate_weighting()` method
tagucci opened this issue · 1 comments
tagucci commented
I encountered an error when I used the PositionRank algorithm.
The IndexError below occurred when calling `extractor.candidate_weighting()` with networkx 3.0; the same code worked correctly with networkx==2.8.8.
IndexError Traceback (most recent call last)
Cell In[6], line 15
12 extractor.candidate_selection()
14 # candidate weighting, in the case of TopicRank: using a random walk algorithm
---> 15 extractor.candidate_weighting()
17 # N-best selection, keyphrases contains the 10 highest scored candidates as
18 # (keyphrase, score) tuples
19 keyphrases = extractor.get_n_best(n=10)
File /home/pke/pke/unsupervised/graph_based/positionrank.py:171, in PositionRank.candidate_weighting(self, window, pos, normalized)
168 self.positions[word] /= norm
170 # compute the word scores using biased random walk
--> 171 w = nx.pagerank(G=self.graph,
172 alpha=0.85,
173 tol=0.0001,
174 personalization=self.positions,
175 weight='weight')
177 # loop through the candidates
178 for k in self.candidates.keys():
File /usr/local/lib/python3.10/dist-packages/networkx/classes/backends.py:134, in _dispatch.<locals>.wrapper(*args, **kwds)
132 @functools.wraps(func)
133 def wrapper(*args, **kwds):
--> 134 graph = args[0]
135 if hasattr(graph, "__networkx_plugin__") and plugins:
136 plugin_name = graph.__networkx_plugin__
IndexError: tuple index out of range
I used the minimal example from the README, switching the extractor to PositionRank:
import pke
extractor = pke.unsupervised.PositionRank()
extractor.load_document(input='text', language='en')
extractor.candidate_selection()
extractor.candidate_weighting()
keyphrases = extractor.get_n_best(n=10)
I suspect that the issue is related to networkx/networkx#6458. Either pinning networkx==2.8.8 in requirements.txt or fixing the `nx.pagerank` call in positionrank.py should solve this problem.
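To illustrate what the traceback seems to show, here is a minimal sketch of the failure in plain Python. The `dispatch` decorator and the `pagerank` placeholder below are hypothetical stand-ins that only imitate the `args[0]` lookup visible in the traceback; they are not networkx's actual code. The assumption is that the wrapper expects the graph as the first positional argument, so a call with `G=...` leaves `args` empty.

```python
import functools


def dispatch(func):
    """Simplified stand-in for the wrapper in the traceback (not networkx's code)."""
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        # Mirrors `graph = args[0]` from the traceback: this lookup raises
        # IndexError when the graph was passed as a keyword argument.
        graph = args[0]
        return func(*args, **kwds)
    return wrapper


@dispatch
def pagerank(G, alpha=0.85):
    """Hypothetical placeholder that only reports what it received."""
    return {"nodes": len(G), "alpha": alpha}


# A toy adjacency mapping standing in for the word graph.
graph = {"a": ["b"], "b": ["a"]}

# Keyword call, as in positionrank.py: args is empty, so args[0] fails.
try:
    pagerank(G=graph, alpha=0.85)
    keyword_call_error = None
except IndexError as exc:
    keyword_call_error = str(exc)

# Positional call: args[0] is the graph, so the call goes through.
positional_result = pagerank(graph, alpha=0.85)

print(keyword_call_error)
print(positional_result)
```

If this reading is right, the analogous change in positionrank.py would be to pass `self.graph` as the first positional argument to `nx.pagerank` instead of `G=self.graph`, keeping the other keyword arguments as they are. I have not verified this against networkx 3.0.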