ramonhagenaars/nptyping

Wildcard ellipsis ... matching incorrect?

oliver-batchelor opened this issue · 3 comments

As expected (with random imported from numpy, Any from typing, and NDArray and Shape from nptyping):

>>> isinstance(random.randn(3, 2, 55), NDArray[Shape["3, *, ..."], Any])
True  

For these two, I would expect the ellipsis to match the trailing dimensions, but it doesn't.

>>> isinstance(random.randn(3, 2, 55), NDArray[Shape["3, 2, ..."], Any])
False  

>>> isinstance(random.randn(3, 2, 55), NDArray[Shape["3,  ..."], Any])
False

Finally, the ellipsis seems to be allowed only at the end.

>>> isinstance(random.randn(3, 2, 55), NDArray[Shape[" ..., 55"], Any])
nptyping.error.InvalidShapeError: '..., 55' is not a valid shape expression.

Am I just failing to understand how the ellipsis is used here? As far as I can tell, the usual usage is the one from array indexing, where it can match zero or more dimensions. For example, these are all valid numpy indexing expressions:

>>> x = random.randn(2, 3, 55)
>>> x[1, ...].shape
(3, 55)
>>> x[1, 1, ...].shape
(55,)
>>> x[1, 1, 1, ...].shape
()
>>> x[..., 1, 1].shape
(2,)
>>> x[1, ...,  1].shape
(3,)

The ellipsis in an nptyping shape expression means roughly "and so forth": it repeats the dimension that precedes it. E.g. Shape["3, 2, ..."] describes an array with one dimension of size 3 followed by one or more dimensions of size 2.

The wildcard (*) can be used to express a dimension of any size. So the shape of random.randn(3, 2, 55) would match against Shape["3, 2, *"]. Combining the ellipsis with the wildcard allows you to annotate any number of dimensions of any size.
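To make the semantics described above concrete, here is a minimal pure-Python sketch. It is not nptyping's implementation: the helper name matches_shape and its parsing are assumptions made for illustration, and it only mirrors the behaviour described in this thread (fixed sizes, the * wildcard, and a trailing ... that repeats the preceding dimension).

```python
def matches_shape(shape, expression):
    """Check a shape tuple against a simplified shape expression.

    Hypothetical helper, not part of nptyping. Supported tokens:
    integers, "*" (any size), and a trailing "..." that repeats the
    preceding token for all remaining dimensions.
    """
    tokens = [token.strip() for token in expression.split(",")]

    # A trailing "..." means "and so forth": repeat the previous token.
    repeat_last = tokens[-1] == "..."
    if repeat_last:
        tokens = tokens[:-1]

    # The ellipsis is only accepted at the end, after at least one token.
    if not tokens or "..." in tokens:
        raise ValueError(f"{expression!r} is not a valid shape expression.")

    if repeat_last:
        if len(shape) < len(tokens):
            return False
        # Pad the expression with copies of its last token.
        tokens = tokens + [tokens[-1]] * (len(shape) - len(tokens))
    elif len(shape) != len(tokens):
        return False

    return all(tok == "*" or int(tok) == dim for tok, dim in zip(tokens, shape))


print(matches_shape((3, 2, 55), "3, *, ..."))  # True
print(matches_shape((3, 2, 55), "3, 2, ..."))  # False: 55 is not 2
print(matches_shape((3, 2, 55), "3, 2, *"))    # True
```

Under this reading, "3, 2, ..." would accept shapes such as (3, 2) or (3, 2, 2), which is why it rejects (3, 2, 55).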

For more examples, see the documentation.

It is indeed true that the ellipsis is only allowed at the end of a shape expression. The reason is that supporting it elsewhere would take significantly more effort to implement while not adding much value. What use case would benefit from an expression like Shape["3, ..., 2"]? And what about Shape["3, ..., 2, ..., 1"]?

No more recent activity: closing.

I would like to put in a vote for re-opening this. My use-case is for arrays with any number of dimensions followed by 1-2 final dimensions. For a complicated example, you could have an array like:

[batch_size, image_height, image_width, 3x4 projection matrix]

I have functions that only care about the last two dimensions, so, in the functions, I would like to type this like:

NDArray[Shape["..., 3, 4"], Any]

so it is compatible with any array where array.shape[-2:] == (3, 4). (I think that *, ... is not suitable because it implies one or more dimensions, not zero or more.)
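Until a leading ellipsis is supported, the same condition can be checked at runtime. The following is a minimal sketch, assuming a hypothetical helper named has_trailing_shape that takes a plain shape tuple (for a numpy array, pass array.shape); it is not part of nptyping.

```python
def has_trailing_shape(shape, trailing):
    """Return True if the last len(trailing) dimensions of shape equal trailing.

    Hypothetical helper, not part of nptyping. Any number of leading
    dimensions (including zero) is accepted.
    """
    shape = tuple(shape)
    trailing = tuple(trailing)
    if not trailing:
        # Nothing to compare; note that shape[-0:] would return the whole shape.
        return True
    return len(shape) >= len(trailing) and shape[-len(trailing):] == trailing


# [batch_size, image_height, image_width, 3x4 projection matrix]
print(has_trailing_shape((8, 480, 640, 3, 4), (3, 4)))  # True
print(has_trailing_shape((3, 4), (3, 4)))               # True: zero leading dimensions
print(has_trailing_shape((3, 4, 5), (3, 4)))            # False
```

Such a check only runs at call time, so it complements a type annotation rather than replacing one.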