ramonhagenaars/nptyping

dtype NDArray based on enum values

giokara-work opened this issue · 3 comments

I'm trying to limit the values of an entry in an NDArray to those of an enum class.
Here is a minimal example that illustrates one of my attempts so far:

import numpy as np
from typing import Any
from nptyping import NDArray
from nptyping import Shape
from enum import Enum

class MyEnum(Enum):
    ZERO = 0
    ONE = 1
    TWO = 2

def my_function() -> NDArray[Shape['Any'], MyEnum]:
    return np.array([MyEnum.ZERO] * 2 + [MyEnum.ONE] * 3 + [MyEnum.TWO] * 4)

The line in question is:

def my_function() -> NDArray[Shape['Any'], MyEnum]:

With nptyping versions < 2, the code above passed lint checks. Now the following error is raised:

_____________ ERROR collecting test.py ______________
test.py:12: in <module>
    def my_function() -> NDArray[Shape['Any'], MyEnum]:
../pip_deps_pypi__nptyping/nptyping/base_meta_classes.py:138: in __getitem__
    args = cls._get_item(item)
../pip_deps_pypi__nptyping/nptyping/ndarray.py:69: in _get_item
    shape, dtype = cls._get_from_tuple(item)
../pip_deps_pypi__nptyping/nptyping/ndarray.py:111: in _get_from_tuple
    dtype = cls._get_dtype(item[1])
../pip_deps_pypi__nptyping/nptyping/ndarray.py:148: in _get_dtype
    f"Unexpected argument '{dtype_candidate}', expecting"
E   nptyping.error.InvalidArgumentsError: Unexpected argument '<enum 'MyEnum'>', expecting Structure[<StructureExpression>] or Literal[<ShapeExpression>] or a dtype or typing.Any.

The documentation provides a list of possible dtypes. In addition, the Structure keyword can be used to build more complex data structures. Is there also some way to restrict values based on an Enum, since the representation of every option is an integer?

I'd also be happy to use something like Integer[Values[0, 1, 2]] or Set[0, 1, 2] in the case above.

What exactly are you trying to express? Do I understand it correctly that you mean to express an ndarray with some literal values such as zeroes [0, 0, 0] or ones [1, 1, 1]? If this is the case, then I think the best thing Shape currently offers is labels:

>>> NDArray[Shape["N zeroes or ones or twos"], Int32]
NDArray[Shape['N zeroes or ones or twos'], Int]

The isinstance check will not check the values, but it won't break either.

It's not so much about the shape of the array as about the values in it. We would like to limit the values inside the NDArray to those of a limited set. Not all values need to be the same.
The best workaround I've found for now is replacing

def my_function() -> NDArray[Shape['Any'], MyEnum]:

by

def my_function() -> NDArray[Shape['Any'], Int]:

which works, but ideally we would like to limit the possible values in the result to a set defined by us.
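Until values can be expressed in the annotation itself, one option is to check them at runtime. Below is a minimal sketch, assuming the array stores the enum's integer values rather than the Enum members (an array of members would have dtype=object, not an integer dtype). `has_only_enum_values` is a hypothetical helper, not part of nptyping; it relies only on NumPy's `np.isin`.

```python
from enum import Enum
from typing import Type

import numpy as np


class MyEnum(Enum):
    ZERO = 0
    ONE = 1
    TWO = 2


# Hypothetical helper (not an nptyping API): validates values at runtime.
def has_only_enum_values(arr: np.ndarray, enum_cls: Type[Enum]) -> bool:
    """Return True if every element of arr equals the value of a member of enum_cls."""
    allowed = [member.value for member in enum_cls]
    return bool(np.isin(arr, allowed).all())


def my_function() -> np.ndarray:
    # Store the integer values, not the Enum members, so that the
    # resulting array has an integer dtype.
    members = [MyEnum.ZERO] * 2 + [MyEnum.ONE] * 3 + [MyEnum.TWO] * 4
    return np.array([m.value for m in members])


result = my_function()
assert has_only_enum_values(result, MyEnum)
assert not has_only_enum_values(np.array([0, 1, 3]), MyEnum)
```

This only checks values when the helper is called; a static type checker or linter would still see a plain integer array.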

I see what you mean. This is not possible at the moment; you can express types and shapes, but not values. This would require a new construct.

I could imagine something like this:

NDArray[Any, Value["0 | 1 | 2"]] 

Would that express what you want?


Off-topic: Shape['Any'] may not express what you think it does. It expresses 1 dimension of any size, whereas you may have wanted to express any shape at all. You could write Shape["*, ..."] or just typing.Any (not within Shape).