scikit-hep/awkward-0.x

JaggedArray.argmax() returns multiple values for some events

benkrikler opened this issue · 4 comments

I'm having an issue with argmax for version 0.12.5 (current version on pypi).

The following snippet demonstrates the problem:

masses = [[983, 715, 796, 213, 702, 789, 404, 319, 290, 346, 693, 771, 308, 337,
            236, 274, 180, 322, 361, 157, 687, 769, 317, 321, 231, 282, 148, 319,
            367, 144, 189, 275, 303, 137, 62, 928, 983, 171, 726, 177, 206, 715,
            380, 403, 312, 702, 295, 297, 254, 338, 698, 299, 312, 255, 339, 290],
           [705, 884, 645, 377, 783, 476, 321, 738, 126, 316]]
masses = JaggedArray.fromiter(masses)
print("argmax", masses.argmax())
print("max", masses.max())
print("val of argmax", masses[selected])

which will print out the following and then die on the last line:

argmax [[0] [36 1]]
max [983 884]

While the max method works as expected, argmax seems to be confused, seemingly using the wrong starts and stops to define each event.

This might be related to issue #158 but looks distinct in that the result is greater than 1 entry in the resulting argmax array.

It's different from #158, and I just put in a correction in PR #177.

Speedy as ever :) Thank you very much!

Hi, I ran into the same issue. At first I thought this might help

idx = vector_sum.pt.argmax().pad(1,clip=1) #pick up the first of possible multiple values
vector_sum.pt[idx]

but it seems the indices are shifted so that an the superfluous indices move over into the next events. Not sure if this is covered by the fix

the fix in master works, but I wonder if that reveals some other inconsistency (would you expect this not to work, even thought I explicitly clip to 1)?

It probably is covered by the fix: the error was one of shifting superfluous events, rather than hiding them as unreachable data. Your example uses an argmax, so unless you're using the latest version, it's suspect.