scikit-hep/awkward-0.x

IndexError when using MaskedArray and flatten

Closed this issue · 4 comments

jrueb commented

The following two lines raise an IndexError

a = awkward.MaskedArray([True, False], [1, 1])
print((a + 1).flatten())

This is however valid code, is it not? It seems there is a problem with the implementation of flatten for IndexedMaskedArray.

It does look like a bug—thanks for pointing this out. I might not be able to touch it in a couple of days, though. Note that this operation should have no effect—there's no jagged array to flatten, so this would just passed through.

(I got a chance to look at it anyway.) It looks like flattening a MaskedArray is defined in such a way as to eliminate all the None values. I don't think that's what the definition ought to be. It doesn't make sense for an operation to do such different things to lists and non-lists, particularly since the axis parameter only specifies which list depth it operates on; you can't control whether there are any masks at any level.

I'm going to fix it so that flatten just passes through MaskedArrays. If you want to remove None values, do a[a.boolmask(maskedwhen=False)] to use a MaskedArray's mask as a mask. (I also need to invent a new word to reduce redundancy in the terminology.)

After a bit more investigation, I see that this wasn't a bad definition—it was a bad implementation. In PR #209, I've fixed it. It should now work like this:

    def test_issue_208(self):
        a = awkward.MaskedArray([True, False, False, True, False, True], awkward.fromiter([[1, 2, 3], [4, 5], [6], [7, 8], [10, 11, 12], [999]]))
        assert a.flatten().tolist() == [None, 4, 5, 6, None, 10, 11, 12, None]
        assert (a + 100).flatten().tolist() == [None, 104, 105, 106, None, 110, 111, 112, None]
        a = awkward.MaskedArray([True, False, False, True, False, True], [1.1, 2.2, 3.3, 4.4, 5.5, 6.6])
        assert a.flatten().tolist() == [None, 2.2, 3.3, None, 5.5, None]
        assert (a + 100).flatten().tolist() == [None, 102.2, 103.3, None, 105.5, None]

It will probably be a day or so before I can merge this PR, but you can check it out now.

jrueb commented

Looks good! Thanks