miso-belica/sumy

power_method produces NaN, inf values

FlxB2 opened this issue · 1 comment

FlxB2 commented

Hi, I noticed that power_method for LexRankSummarizer and TextRankSummarizer may produce NaN or inf values for some inputs. I am not entirely sure whether that matters for your use case, but it seems odd to me, because other implementations don't seem to have this problem.

For example, the test case here uses the matrix:

matrix = numpy.array([
    [0.1, 0.2, 0.3, 0.6, 0.9],
    [0.45, 0, 0.3, 0.6, 0],
    [0.5, 0.6, 0.3, 1, 0.9],
    [0.7, 0, 0, 0.6, 0],
    [0.5, 0.123, 0, 0.111, 0.9],
])

and LexRankSummarizer.power_method(matrix, LexRankSummarizer.epsilon) returns [inf inf nan inf inf].

Another example is the following matrix:

matrix = numpy.array([
    [0.1, 0.2, 0.3],
    [0.2, 1, 0.5],
    [0.3, 0.5, 0.6],
])

It returns: [inf inf inf].
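One possible explanation, offered as an assumption since the thread does not show the code of power_method: if the method repeatedly multiplies a vector by the transposed matrix without rescaling it, the result can only stay bounded when the largest absolute eigenvalue (the spectral radius) is at most 1. The sketch below checks that for both matrices above. The helper spectral_radius and the names matrix_5x5 and matrix_3x3 are made up for this illustration.

```python
import numpy

# The two matrices from this issue.
matrix_5x5 = numpy.array([
    [0.1, 0.2, 0.3, 0.6, 0.9],
    [0.45, 0, 0.3, 0.6, 0],
    [0.5, 0.6, 0.3, 1, 0.9],
    [0.7, 0, 0, 0.6, 0],
    [0.5, 0.123, 0, 0.111, 0.9],
])
matrix_3x3 = numpy.array([
    [0.1, 0.2, 0.3],
    [0.2, 1, 0.5],
    [0.3, 0.5, 0.6],
])


def spectral_radius(matrix):
    """Hypothetical helper: largest absolute eigenvalue of a square matrix."""
    return float(numpy.max(numpy.abs(numpy.linalg.eigvals(matrix))))


for name, m in (("5x5", matrix_5x5), ("3x3", matrix_3x3)):
    # Rows of a stochastic matrix would each sum to 1; these do not.
    print(name, "row sums:", m.sum(axis=1))
    print(name, "spectral radius:", spectral_radius(m))
```

Neither matrix is row-stochastic, and both have a spectral radius above 1, so an unscaled iteration would grow at every step until it overflows to inf.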

The implementation I found here gives [0.2, 0.5666666666666667, 0.4666666666666667] for the same matrix.
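For comparison, here is a minimal sketch of a power iteration that rescales the vector after every step so that its entries keep summing to 1. The name normalized_power_method and the max_iterations parameter are hypothetical, and this is not necessarily what the linked implementation or sumy itself does. It only illustrates that the iteration stays finite on this input once the vector is renormalized.

```python
import numpy


def normalized_power_method(matrix, epsilon=1e-4, max_iterations=1000):
    """Hypothetical sketch: power iteration with rescaling at each step."""
    transposed = matrix.T
    count = len(matrix)
    # Start from a uniform vector whose entries sum to 1.
    p_vector = numpy.full(count, 1.0 / count)

    for _ in range(max_iterations):
        next_p = transposed @ p_vector
        # Rescale so the entries sum to 1 again; this prevents overflow.
        next_p = next_p / next_p.sum()
        delta = numpy.linalg.norm(next_p - p_vector)
        p_vector = next_p
        if delta < epsilon:
            break

    return p_vector


matrix = numpy.array([
    [0.1, 0.2, 0.3],
    [0.2, 1, 0.5],
    [0.3, 0.5, 0.6],
])
scores = normalized_power_method(matrix)
print(scores)
```

The max_iterations cap is a safety net, so the loop ends even if the vector never settles within epsilon.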

I used Python 3.9.16 and NumPy 1.24.1.

Fixed in #194. Thanks, @AryazE 🙂