`reduce_mean` with zero-length sequence fails on NumPy backend

Question

connorbrinton opened this issue a year ago · 2 comments

How to reproduce the behaviour

The `reduce_mean` layer docs state:

"Pooling layer that reduces the dimensions of the data by computing the average value of each feature. Zero-length sequences are reduced to the zero vector."

The `reduce_mean` layer delegates almost directly to the `reduce_mean` op and does not handle zero-length sequences itself. However, the NumPy implementation of the op raises an assertion error when given a zero-length sequence (in either the data or the lengths of the ragged object).

Here's an example that demonstrates the issue:

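from thinc.api import get_current_ops, reduce_mean, use_ops
from thinc.types import Ragged

# Force the NumPy backend for the duration of the block.
with use_ops("numpy"):
    ops = get_current_ops()
    # A ragged batch with no rows and no sequence lengths.
    ragged = Ragged(ops.alloc2f(0, 0), ops.alloc1i(0))
    # Raises an assertion error on Thinc 8.1.10.
    reduce_mean()(ragged, is_train=False)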

I haven't checked whether the CuPy backend has the same error.
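
For reference, here is a rough sketch in plain NumPy of the behaviour the docs describe, independent of Thinc. `mean_pool_ragged` is a hypothetical helper written for illustration, not a Thinc API, and returning an empty array for an empty batch is an assumption on my part rather than anything the docs specify.

```python
import numpy as np


def mean_pool_ragged(data: np.ndarray, lengths: np.ndarray) -> np.ndarray:
    """Average the rows of each sequence; empty sequences become zero vectors.

    Hypothetical helper (not part of Thinc). `data` has shape
    (total_rows, width) and `lengths` holds one row count per sequence.
    """
    if int(lengths.sum()) != data.shape[0]:
        raise ValueError("lengths must sum to the number of rows in data")
    # Start from zeros so that zero-length sequences need no special write.
    out = np.zeros((lengths.shape[0], data.shape[1]), dtype=data.dtype)
    start = 0
    for i, length in enumerate(lengths):
        length = int(length)
        if length > 0:
            out[i] = data[start:start + length].mean(axis=0)
        start += length
    return out


# Three sequences of lengths 2, 0 and 1 over two features.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype="float32")
lengths = np.array([2, 0, 1], dtype="int32")
pooled = mean_pool_ragged(data, lengths)  # middle row is the zero vector

# The fully empty case from the reproduction above: no rows, no sequences.
# Under the assumption stated above, this yields an array of shape (0, 0).
empty = mean_pool_ragged(
    np.zeros((0, 0), dtype="float32"), np.zeros((0,), dtype="int32")
)
```

The loop is written for readability, not speed; it only illustrates the zero-vector semantics and makes no claim about how the op itself is or should be implemented.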

Your Environment

Operating System: macOS 13.5
Python Version Used: 3.9.16
Thinc Version Used: 8.1.10
Environment Information: Poetry virtual environment, M1 Mac, ARM installation

Answer 1 · 2023-08-07T07:56:37.000Z

Thanks for reporting this! We have a PR for this issue (#882), but we haven't yet decided on the preferred behavior (returning arrays with a zero-length dimension or raising an exception).

Answer 2 · 2023-08-07T11:30:24.000Z

#882 merged :-)