python/mypy

Support recursive types

o11c opened this issue · 83 comments

o11c commented

The following in particular would be useful:

Callback = Callable[[str], 'Callback']
Foo = Union[str, List['Foo']]

If (or when) mypy gets structural subtyping, recursive types would also be useful there.

My current plan is to postpone recursive types until simple structural subtyping is in and reconsider them later. After thinking about them more, they are going to increase complexity a lot, and I haven't seen much evidence for them being needed that often.

If I'm understanding this issue right, I'm running into it for classes that want a fluent interface. So for example, if I want callers to do something like this:

myfoo.add(bar).add(baz).finish()

Then the definition of the Foo class and the add method need to look something like this:

class Foo:
    def add(self, x) -> Foo:  # Python chokes on this line!
        # do stuff
        return self

Another place where Python commonly does return self is in the __enter__ method for context managers. Is mypy able to typecheck those right now?

@oconnor663 Try:

class Foo:
    def add(self, x) -> 'Foo':
        # do stuff
        return self
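The same string-literal trick also works for a context manager's __enter__; a minimal sketch (the Resource class here is hypothetical):

from types import TracebackType
from typing import Optional, Type

class Resource:
    def __enter__(self) -> 'Resource':  # forward reference as a string literal
        return self

    def __exit__(self,
                 exc_type: Optional[Type[BaseException]],
                 exc: Optional[BaseException],
                 tb: Optional[TracebackType]) -> None:
        return None

with Resource() as r:
    pass  # mypy infers r as Resource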

Ah, thank you.

@kirbyfan64, do you know if there are standard functions anywhere that understand this convention? Like, if I wanted to introspect a couple of functions and compare the types of their arguments, should I handle the Foo == "Foo" case explicitly? That seems doable, but a string-aware version of, say, isinstance seems harder.

@oconnor663 I don't think there's anything like that. If you're introspecting the functions via a decorator, you could try accessing the caller's globals and locals.

You're aware of typing.get_type_hints(obj), right? It is similar to obj.__annotations__ but expands forward references.
https://docs.python.org/3/library/typing.html?highlight=typing#typing.get_type_hints
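For illustration, a small sketch of the difference (assuming Foo is defined at module level):

from typing import get_type_hints

class Foo:
    def add(self, x: int) -> 'Foo':
        return self

print(Foo.add.__annotations__)  # {'x': <class 'int'>, 'return': 'Foo'} -- the string is kept as-is
print(get_type_hints(Foo.add))  # {'x': <class 'int'>, 'return': <class 'Foo'>} -- forward reference resolved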

There used to be an isinstance() implementation in typing.py, but Mark Shannon made me take it out. It's being deleted in this rev:
python/typing@ac7494f

--Guido van Rossum (python.org/~guido)

@gvanrossum that's exactly what I was looking for, thanks. Sorry for the n00b questions today, but awesome that all this is supported.

Mypy should detect missing string literal escapes (see #948).

Going back to the original point on the issue, I found a case in the stdlib where this would be needed; the type for isinstance() is currently:

def isinstance(o: object, t: Union[type, Tuple[type, ...]]) -> bool: ...

but it should actually be:

ClassInfo = Union[type, Tuple['ClassInfo', ...]]
def isinstance(o: object, t: ClassInfo) -> bool: ...

Because according to https://docs.python.org/3/library/functions.html#isinstance the tuples can be nested. I found an actual example of this while typechecking django.http.response.HttpResponse.content.
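A quick runtime sketch of the nesting that the current Tuple[type, ...] stub cannot express (the classes chosen here are arbitrary):

# isinstance() accepts arbitrarily nested tuples of types at runtime:
print(isinstance(b"abc", (str, (bytes, bytearray))))  # True
print(isinstance(3.5, (int, (str, (bytes,)))))        # False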

I have come across this while trying to define a generic JSON type:

JSON = Union[Dict[str, "JSON"], List["JSON"], str, int, float, bool, None]

So consider this a +1 for supporting this use case.

@srittau JSON needs to be Any because it is recursive and you can give json.loads a custom JSONDecoder class:

_PlainJSON = Union[Dict[str, "_PlainJSON"], List["_PlainJSON"], str, int, float, bool, None]
_T = TypeVar('_T')
JSON = Union[_PlainJSON, _T, Dict[str, "JSON"], List["JSON"]]
def loads(data: str, cls: Type[JSONDecoder[_T]]) -> JSON: ...

Of course, recursive types and Type[JSONDecoder[_T]] types aren't supported.

The following pattern seems to be good enough for my purposes. The boilerplate is tolerable for me.

class BTree(NamedTuple):
    val: int
    left_: Any
    right_: Any

    # typed constructor
    @staticmethod
    def c(val: int, left: Optional['BTree'] = None, right: Optional['BTree'] = None) -> 'BTree':
        return BTree(val, left, right)

    # typed accessors
    @property
    def left(self) -> Optional['BTree']:
        return cast(Optional[BTree], self.left_)
    @property
    def right(self) -> Optional['BTree']:
        return cast(Optional[BTree], self.right_)

atree = BTree.c(1, BTree.c(2, BTree.c(3), BTree.c(4)), BTree.c(5))
atree2 = BTree.c(1, BTree.c(2, BTree.c(3), BTree.c(4)), BTree.c(5))
assert atree == atree2
assert isinstance(atree, BTree) and isinstance(atree.left, BTree) and isinstance(atree.left.left, BTree)

Latest version of the pattern. We use this example at Legalese for interacting with SMT solvers (the SMTLIB language).

I found that I ended up forgetting to use the typed .c static method instead of the untyped constructor in the BTree example above. This version addresses that. Its only very minor deficits are:

  1. Boilerplate (which I don't care about if the datatype is significant enough for us to care about the difference between an immutable recursive datatype and Tuple[Any,...])
  2. The constructor SMTExprNonatom and the type SMTExprNonatom_ do not share the same name. But this is only aesthetic; you won't use one when you mean the other, since with this naming convention SMTExprNonatom will come up first in autocomplete, and only SMTExprNonatom_ can be used in a type position.
# Immutable recursive datatypes pattern
SMTAtom = Union[str, int, float, bool]
SMTExpr = Union['SMTExprNonatom_',SMTAtom]
class SMTExprNonatom_(NamedTuple):  
    symb: str
    args_: Tuple[Any,...] # see `args` below
    @staticmethod  # typed constructor, which we alias to `SMTExprNonatom` after the class def
    def c(symb:str, args:Iterable[SMTExpr]) -> 'SMTExprNonatom_': return SMTExprNonatom_(symb, tuple(args))
    @property # typed accessor
    def args(self) -> Tuple[SMTExpr, ...]: return cast(Tuple[SMTExpr, ...], self.args_)
SMTExprNonatom = SMTExprNonatom_.c
SMTCommand = NewType('SMTCommand', SMTExprNonatom_)

Updating to normal priority since this comes up frequently and the implementation got easier with some recent changes.

@JukkaL, anything I could do to help?

@DustinWehr We are not ready to start implementing this yet -- it's still a pretty complex feature, and we have other high-priority work scheduled for the near future. Finding additional real-world use cases where recursive types could be used would be helpful, though. JSON is the canonical use case, but beyond that we don't have many examples.

An example would be describing a typed AST, as those are usually recursive.
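For instance, a hypothetical sketch of a small expression AST, which would want an alias along these lines (today the nested reference degrades to Any):

from typing import List, Tuple, Union

# An expression is a literal, a name, or an operator applied to sub-expressions.
Expr = Union[int, str, Tuple[str, List['Expr']]]  # needs recursive aliases

tree = ('+', [1, ('*', ['x', 2])])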

We use a library called asynq and it has a recursive type for futures.

@JukkaL in case you are skeptical that people (besides the teams speaking up in this thread) are using Python for typical FP use cases, a couple of serious example projects of that type are:
http://microsoft.github.io/ivy/language.html
https://en.wikipedia.org/wiki/SageMath

Just a dump of a recent discussion with @JukkaL:

  • We will use µ-type representation of recursive types, i.e. TypeAlias node + TypeAliasInstance type.
  • We can support generic recursive aliases like A = Tuple[T, A[T]], and then A[int].
  • µ-type approach solves the serialization problem, i.e. the same cross_ref dance can be repeated only for TypeAlias vs TypeAliasInstance as for TypeInfo vs Instance.
  • Subtyping, meets, joins, etc. will be calculated essentially as for protocols with a single covariant member. We will have separate assumption stacks at the top of corresponding functions (to not mix with protocols).
  • Calling s = expand_once(s), t = expand_once(t) at the top of type-expecting functions will save many isinstance() checks for branches for different type kinds.
  • However, mypy has a lot of ad-hoc subtyping checks that need to be updated in addition; we could use either an LGTM query or a simple mypy visitor to find all relevant isinstance() calls.
  • Refactoring that will introduce TypeAlias symbol node (for unrelated benefits) is underway, PR will be up shortly.
  • Realistic time for implementation is 2-3 weeks.
  • Potential target schedule (taking into account other tasks and priorities) is January-February 2019.

Just stumbled over this from the JSON example perspective... Is there any timeline/roadmap for recursive types already?

Potential target schedule (taking into account other tasks and priorities) is January-February 2019.

We have a workaround in our codebase:

JSON_PRIMITIVE = Union[None, bool, str, int, float]
if TYPE_CHECKING:
    # To avoid hitting an Any in mypy because of the recursive type we add a
    # couple of layers of non-recursive types.
    JSON_BASE = Union[  # type: ignore # Recursive type
        JSON_PRIMITIVE, Dict[str, "JSON_BASE"], List["JSON_BASE"]
    ]
    JSON2 = Union[  # type: ignore # Recursive type
        JSON_PRIMITIVE, Dict[str, JSON_BASE], List[JSON_BASE]
    ]
    JSON1 = Union[  # type: ignore # Recursive type
        JSON_PRIMITIVE, Dict[str, JSON2], List[JSON2]
    ]
    JSON = Union[  # type: ignore # Recursive type
        JSON_PRIMITIVE, Dict[str, JSON1], List[JSON1]
    ]
else:
    JSON = Union[JSON_PRIMITIVE, Dict[str, "JSON"], List["JSON"]]

This gives us a couple of layers of Dicts and Lists before we get to Any. However, we've learned that using JSON is exceedingly painful because of that Union: you have to either isinstance very heavily or cast to get around issues. Dict[str, Any] is sadly the name of the game most of the time.
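A sketch of what that narrowing looks like in practice with the JSON alias above (the helper is made up):

def get_name(payload: JSON) -> str:
    # Every access through the Union needs an isinstance check (or a cast):
    if isinstance(payload, dict):
        name = payload.get("name")
        if isinstance(name, str):
            return name
    raise ValueError("unexpected payload shape")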

I ended up going with this in a previous project: you can construct a JSON parser given a typing.NamedTuple class that matches the JSON structure.

https://gist.github.com/Daenyth/37d615e502114009d6a33652a814a7c8

Speaking of workarounds for JSON, here is a pretty solid workaround using protocols. @ilevkivskyi, @JukkaL, do you see any potential gotchas beyond mimicking the required behavior of list and dict (which could be improved by including more functions)? Also, I'm not sure about performance.

from typing import Any, Dict, Union, Sequence, overload
from typing_extensions import Protocol


class _JSONArray(Protocol):
    def __getitem__(self, idx: int) -> 'JSONLike': ...
    # hack to enforce an actual list
    def sort(self) -> None: ...


class _JSONDict(Protocol):
    def __getitem__(self, key: str) -> 'JSONLike': ...
    # hack to enforce an actual dict
    @staticmethod
    @overload
    def fromkeys(seq: Sequence[Any]) -> Dict[Any, Any]: ...
    @staticmethod
    @overload
    def fromkeys(seq: Sequence[Any], value: Any) -> Dict[Any, Any]: ...


JSONLike = Union[str, int, float, bool, None, _JSONArray, _JSONDict]


obj: JSONLike = {"1": 1}  # ok
obj2: JSONLike = {1: 1}  # fail
obj3: JSONLike = [1, 2, {3: [5, 6]}]  # fail
obj4: JSONLike = {"s": [None]}  # ok
obj5: JSONLike = [[b'a']]  # fail

Hm, ok, the limitation of this approach obviously being that it's "one-directional", one would still have to cast JSONArray/Dict to List/Dict when getting out of JSON.
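For example (a hypothetical helper showing the cast in question):

from typing import List, cast

def add_entry(arr: _JSONArray) -> None:
    # _JSONArray only exposes __getitem__/sort, so mutating it as a real
    # list still requires casting back on the way out:
    cast(List[JSONLike], arr).append("new entry")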

@DustinWehr ...Finding additional real-world use cases where recursive types could be used would be helpful, though. JSON is the canonical use case, but beyond that we don't have many examples.

@JukkaL I suppose it's a pretty similar problem to that of JSON, but I've been trying to add type hints to apache/avro. Without recursive types I can't express the shape of avro schema completely.

Here is the work in progress. Variance is hard and mypy has found a lot of problems, so I expect to be working on this for a while.

pstch commented

Regarding real-world use cases: recursive types are useful when writing programs that deal with explicitly recursive data structures (not just JSON), which encompasses a large class of programs. Such programs can be written and properly specified without recursive data types, but recursive types can be needed to provide a complete type specification of some implementations.

Another real world use case. I'm writing a function for a CMS which builds a page tree from a value like this:

[
    (
        Page(title='Page 1'),
        [
            (Page(title='Page 2'), ...),
            (Page(title='Page 2'), ...),
        ]
    ),
]

where ... are nested iterables of the same structure, representing the child pages. The type would be something like

PageTree = Iterable[Tuple[Page, 'PageTree']]

Potential target schedule (taking into account other tasks and priorities) is January-February 2019.

I'm not sure if there is any update on the timeline for this, but if it won't be soon, would it be possible to add a config option to suppress the 'Recursive types not fully supported yet, nested types replaced with "Any"' error message?

I'm okay with my project having Anys for now, and it'd be nice to be able to write the recursive types now and not need to go back and replace them once recursive types are implemented.

I'm not sure if there is any update on the timeline for this

I think we are still on time (most likely late February).

Is there a place to watch or contribute to progress on this?

@tedkornish #6204 and any other commits into newsemanal

@JukkaL it looks like newsemanal landed, did it include support for recursive types?

It looks like we'll need to wait a bit more until we add recursive type support. I can't give a new estimate yet.

Hm, ok, the limitation of this approach obviously being that it's "one-directional", one would still have to cast JSONArray/Dict to List/Dict when getting out of JSON.

A work-around proposal that does not require casting to List/Dict:

from typing import Optional, Union, Iterable, Mapping, List, Dict


Primitive = Union[str, int, bool]
JsonType = Optional[Union[Primitive, "JsonList", "JsonDict"]]


class JsonList(list):
    """
        List type containing JsonType elements.
    """
    def __init__(self, iterable: Iterable[JsonType]) -> None:
        super().__init__(iterable)


class JsonDict(dict):
    """
        Dict type with str keys and JsonType values.
    """
    def __init__(self, mapping: Optional[Mapping[str, JsonType]] = None, **kwargs: JsonType) -> None:
        if mapping is None:
            super().__init__(**kwargs)
        else:
            super().__init__(mapping, **kwargs)


lst: JsonList = JsonList([
    1,
    JsonList([
        2,
        JsonList([
            3,
            JsonList([
                4,
                JsonList([
                    5,
                    JsonDict(
                        recursion=True
                    ),
                    True,
                    None
                ])
            ])
        ])
    ])
])
dct: Dict[str, JsonType] = JsonDict(
    name="myName",
    lst=lst,
    integer=12,
    none=None
)

dct_lst: JsonType = dct.get("lst")
dct_integer: JsonType = dct["integer"]
generic_list: List[JsonType] = lst
lst_item: JsonType = lst[0]
generic_list_item: JsonType = generic_list[0]

Of course, the Primitive type should be expanded to include all desired primitives.

@sanderr your code is very useful, I've modified it into a strict version that doesn't use Any and passes the strictest configuration: https://gist.github.com/catb0t/bd82f7815b7e95b5dd3c3ad294f3cbbf

@catb0t I opted for the generic list and dict types over List[JsonType] and Dict[str, JsonType] because one of my vim plugins complained: no-member: Instance of 'JsonDict' has no 'get' member and similar. I assumed this complaint came from mypy, but it appears to be something else, so it seems like your modification is indeed better.
Two observations:

  • if you inherit from the specific list and dict types as you do, I believe it is not even necessary to override __init__.
  • if you do decide to override __init__, be aware that I wrote the above more as a proof of concept than as a stable drop-in replacement for the list and dict types. I believe the dict constructor in particular is a bit more complex than what I wrote, so it might be worth looking into that. For example, the code snippet below shows a valid construction of a dict that isn't typed correctly by my implementation:
dict([("key", "value"), ("other_key", "other_value")])
JsonDict([("key", "value"), ("other_key", "other_value")])
mitar commented

@ilevkivskyi Does this also work now with the new analyzer?

Not yet, @ilevkivskyi is actually currently working on adding support for recursive types. He will know the exact schedule better but it looks like mypy will support recursive types in the fairly near future.

ir4y commented

Are there any updates @ilevkivskyi ?
Is there any help needed?

@ir4y There are only sad updates: I am no longer working on this, and likely will not work on this in the foreseeable future. You can try pushing it yourself, but it is not an easy task, and I can't guide you, sorry (others may still have time for this).

ir4y commented

@ilevkivskyi

it is not an easy task

Sure it is. Otherwise, it would already be done.

I will try to add support for recursive types.

I can't guide you

It is sad.

Could you at least provide some details that may be helpful?
What is a good start point for this task?
Do you have any WIP pull-request?
What common issues will I face?

There are a couple of other type-checkers for Python out there...

Facebook released pyre-check. I don't know if it supports recursive types because I get cryptic parse errors when trying to run it on my project.

Google's pytype seems to handle recursive types ok, I can run it on my project with the # type: ignore comments removed from the recursive types and it's all fine.
Less anecdotally, I think one of the devs commented here that it is supported:
https://lobste.rs/s/0uv5hy/how_quickly_find_type_issues_your_python#c_df7acb

...maybe something can be learned from the pytype codebase to bring this feature into mypy?

ir4y commented

@anentropic
There is also pyright by Microsoft.

Here's a very short summary of how recursive types could be implemented (on top of the foundation that @ilevkivskyi has already built).

A simple self-recursive type would use a TypeAlias that has a TypeAliasType containing the recursive reference. TypeAliasType instances can be expanded an arbitrary number of times (the expansion of a recursive type is infinite, so we'd only expand as much as is needed). The tricky bit is that type operations such as subtype checks would need to limit recursion when there are type alias types, similar to how we deal with recursive protocols right now (see _assuming etc. in TypeState and how it's used).

When I have more time I can write a more detailed description and/or provide links to external resources.

To start with, I'd recommend trying to support some basic operations for a simple recursive type such as X = Union[int, List['X']].

We can disable support for recursive types initially while the implementation is incomplete, and provide a command-line option to enable them. This way the feature can be implemented incrementally, and PRs can be merged to master even if things are not quite ready. Even if whoever starts to work on this can't finish the implementation, somebody else can later continue the work, without impacting mypy users.

This is the most thumbed-up open issue by a large margin.

@Mattwmaster58 unfortunately it is also rather complicated, as it requires a lot of changes to mypy. Ivan has done a lot of work on this, but there is still more to do, and he isn't working on it anymore.

We understand there is great interest and need for recursive types, but open source works at the speed of people's spare time and interest.

Is there anything that newcomers to the project might realistically be able to help with here?

For anyone else trying to do recursive Callables: a callback protocol seems to work fine:

class Foo(Protocol):
    def __call__(self) -> Awaitable['Foo']: ...

or similar.
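A tiny usage sketch (names made up): an async step function that hands back the next callback to run, which mypy should accept as matching the protocol above.

async def step() -> Foo:
    # ... do one unit of work ...
    return step  # the function itself structurally matches Foo

next_cb: Foo = step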

I would also like to express interest in this feature.
In particular:

NestedStrDict = Dict[str, Union["NestedStrDict", int]]

For objects such as:

obj = {
    'a' : 0,
    'nested_dict' : 
    { 
        'b' : 1,
        'c' : 2,
        'nested2_dict' : 
        {
             'd' : 3
        }
    }
}
sg495 commented

Edit: I did not see the comment #731 (comment) by @jhrmnn when first reading through this issue. I guess I am proposing the same thing and wondering exactly how much work remains to be done before we get a protocol-based solution to (some) recursive types, like the one described in that/this comment. (I should disclaim that I am an avid user of Mypy but not sufficiently acquainted with the nitty-gritty of its implementation. That said: if it is realistic for me to help with this issue, I'm certainly happy to help.)

Original Comment

It seems that some additional support for list literals and dictionary literals might make it possible to implement a recursive JSON-like type using protocols, as follows:

from typing import Iterator, Protocol, runtime_checkable, Union
from typing import KeysView, ValuesView, ItemsView

@runtime_checkable
class JSONSequence(Protocol):
    """
        Protocol for a sequence of JSON values.
    """

    def __getitem__(self, idx: int) -> "JSONLike":
        ...

    def __contains__(self, value: "JSONLike") -> bool:
        ...

    def __len__(self) -> int:
        ...

    def __iter__(self) -> Iterator["JSONLike"]:
        ...

    def __reversed__(self) -> Iterator["JSONLike"]:
        ...

@runtime_checkable
class JSONMapping(Protocol):
    """
        Protocol for a mapping of strings to JSON values.
    """

    def __getitem__(self, key: str) -> "JSONLike":
        ...

    def __contains__(self, key: str) -> bool:
        ...

    def __len__(self) -> int:
        ...

    def __iter__(self) -> Iterator[str]:
        ...

    def keys(self) -> KeysView[str]:
        ...

    def values(self) -> ValuesView["JSONLike"]:
        ...

    def items(self) -> ItemsView[str, "JSONLike"]:
        ...

JSONLike = Union[None, bool, int, str, JSONSequence, JSONMapping]
""" Recursive JSON type. """

Indeed, it seems that Mypy (v 0.782) already recognises some of the interesting combinations as correct, and only requires a little nudging in other cases.

# ... Continued from before ...
from typing import cast

x: JSONLike = None                                  # OK
x = True                                            # OK
x = 42                                              # OK
x = "hello"                                         # OK
x = []                                              # OK
x = [None, True, 43, "hello"]                       # OK
x = {"the final answer": 42}                        # OK
x = {"burn baby burn": "master ignition routine"}   # OK
x = {
    "the final answer": cast(JSONLike, 42),
    "burn baby burn": "master ignition routine"
}                                                   # OK
x = {
    "the final answer": cast(JSONLike, 42),
    "burn baby burn": "master ignition routine",
    "a bunch of stuff": {
        "none": None,
        "bool": True,
        "int": 43,
        "str": "hello",
        "list": [None, True, 43, "hello"]
    }
}                                                   # OK

x = {}
# Mypy Error: assignment - Incompatible types in assignment
# expression has type "Dict[<nothing>, <nothing>]",
# variable has type "Union[None, bool, int, str, JSONSequence, JSONMapping]"

x = {
    "the final answer": 42,
    "burn baby burn": "master ignition routine"
}
# Mypy Error: assignment - Incompatible types in assignment
# expression has type "Dict[str, object]",
# variable has type "Union[None, bool, int, str, JSONSequence, JSONMapping]"

One has to be a little careful with runtime typechecking, e.g. because strings and dictionaries match the protocol for JSON sequences (at least as written above), but I'm confident that this could be remedied with little additional work:

# ... Continued from before ...
from collections import deque
from typing import Dict

def to_strict_json(json_like: JSONLike) -> Union[None, bool, int, str, list, dict]:
    """
        Recursively converts `JSONSequence`s to `list`s
        and `JSONMapping`s to `dict`s.
    """
    if isinstance(json_like, str):
        # strings match the `JSONSequence` protocol, must intercept.
        return json_like
    if isinstance(json_like, JSONMapping):
        # dictionaries match the `JSONSequence` protocol, must intercept
        return {k: to_strict_json(v) for k, v in json_like.items()}
    if isinstance(json_like, JSONSequence):
        return [to_strict_json(x) for x in json_like]
    return json_like

class MyReadonlyDict(JSONMapping):
    """ A silly readonly dictionary wrapper. """
    _dict: Dict[str, JSONLike]
    def __init__(self, d):
        super().__init__()
        self._dict = dict(d)
    def __getitem__(self, key: str) -> JSONLike:
        return self._dict[key]
    def __contains__(self, key: str) -> bool:
        return key in self._dict
    def __len__(self) -> int:
        return len(self._dict)
    def __iter__(self) -> Iterator[str]:
        return iter(self._dict)
    def keys(self) -> KeysView[str]:
        return self._dict.keys()
    def values(self) -> ValuesView[JSONLike]:
        return self._dict.values()
    def items(self) -> ItemsView[str, JSONLike]:
        return self._dict.items()

x = deque([None, True, 43, MyReadonlyDict({'a str': ['hello', 'bye bye']})]) # OK
print(to_strict_json(x))
# [None, True, 43, {'a str': ['hello', 'bye bye']}]

PS: I'm just throwing this out there as an instance of protocols already being "close enough" for some recursive types to be written. I'm not saying that this would necessarily be a full solution for a JSON type (cf python/typing#182).

If I remember correctly, # type: ignore used to suppress recursive type errors, but in the current version (mypy==0.790) it doesn't seem to work - I'm seeing a lot of these errors in spite of # type: ignore. Am I missing something?

@antonagestam that issue is closed. Does this mean that there isn't (and will not be) a way to suppress these errors until recursive types are fully implemented?

@altvod I think that is a safe assumption to make.

intgr commented

An option to suppress errors due to recursive types is a very reasonable request, and far easier to implement than full support for recursive types. Especially now that this issue is basically disowned, it's not clear when someone will pick it up again.

Please open a separate issue for the suppression capability.

I realized that I had in fact a slightly different error: #8695

csenn commented

It's clear this issue is complicated, but I just want to provide a recursive TypedDict example that is probably pretty common and currently failing, for anyone searching.

Self Reference:

from typing import TypedDict, List

Comment = TypedDict('Comment', {
  "text": str,
  "childComments": List['Comment']
})

Deep Reference:

from typing import TypedDict, List

Comment = TypedDict('Comment', {
  "text": str,
  "reply": "CommentThread"
})

CommentThread = TypedDict('CommentThread', {
  "likes": int,
  "comments": List[Comment]
})

Both display error: possible cyclic definition

JSON = Union[Dict[str, "JSON"], List["JSON"], str, int, float, bool, None]
This example, while a classic case, plays poorly with the type variance rules in Python. You would hope that List[int] is a JSON type, but it is not, due to variance. Expanding the JSON type for the list case gives:

List[Union[Dict[str, "JSON"], List["JSON"], str, int, float, bool, None]]

List[int] is not a subtype of List[Union[int, ...]]. You can make a covariant JSON type if you are willing to use Mapping/Sequence, but if you want an actual list/dict, or intend your JSON type to be mutable, it will be hard to respect the variance rules and still have useful recursive types involving any invariant generic.
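A sketch of the mismatch, assuming a checker that already supports recursive aliases (pyright today, mypy eventually):

from typing import Dict, List, Mapping, Sequence, Union

JSON = Union[Dict[str, "JSON"], List["JSON"], str, int, float, bool, None]
ReadOnlyJSON = Union[Mapping[str, "ReadOnlyJSON"], Sequence["ReadOnlyJSON"],
                     str, int, float, bool, None]

nums: List[int] = [1, 2, 3]
bad: JSON = nums         # rejected: List is invariant, so List[int] is not a List[JSON]
ok: ReadOnlyJSON = nums  # accepted: Sequence is covariant in its element type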

pyright has recursive type support, but actually writing a JSON type with list/dict and then using list values with it doesn't cooperate with the variance rules. I think any solution would require updating the variance rules specifically for recursive types.

edit: My understanding was wrong; after thinking/discussing it more, the issue goes away if you are fine with list/dict elements being non-homogeneous (as JSON is).

BvB93 commented

Ever since the most recent mypy release there seems to be some basic support for recursive types, as can be seen in the recursive protocol version of Sequence below:

from __future__ import annotations

from collections.abc import Iterator
from typing import TypeVar, Protocol, overload, Any, TYPE_CHECKING

_T_co = TypeVar("_T_co")

class _RecursiveSequence(Protocol[_T_co]):
    def __len__(self) -> int: ...
    @overload
    def __getitem__(self, __index: int) -> _T_co | _RecursiveSequence[_T_co]: ...
    @overload
    def __getitem__(self, __index: slice) -> _RecursiveSequence[_T_co]: ...
    def __contains__(self, __x: object) -> bool: ...
    def __iter__(self) -> Iterator[_T_co | _RecursiveSequence[_T_co]]: ...
    def __reversed__(self) -> Iterator[_T_co | _RecursiveSequence[_T_co]]: ...
    def count(self, __value: Any) -> int: ...
    def index(self, __value: Any, __start: int = ..., __stop: int = ...) -> int: ...


def func1(a: _RecursiveSequence[int]) -> int: ...

if TYPE_CHECKING:
    reveal_type(func1([1]))         # Revealed type is "builtins.int"
    reveal_type(func1([[1]]))       # Revealed type is "builtins.int"
    reveal_type(func1([[[1]]]))     # Revealed type is "builtins.int"
    reveal_type(func1((1, 2, 3)))   # Revealed type is "builtins.int"
    reveal_type(func1([(1, 2, 3)])) # Revealed type is "builtins.int"
    reveal_type(func1([True]))      # Revealed type is "builtins.int"

The only area where mypy still fails is if typevars are involved. This is a bit of a shame, but the fact that there is now basic support for recursive types is already a big step forward.

_T = TypeVar("_T")

def func2(a: npt._NestedSequence[_T]) -> _T: ...

seq_1d_a: list[int]
seq_1d_b = [1]
seq_2d_a: list[list[int]]
seq_2d_b = [[1]]

if TYPE_CHECKING:
    # The good
    reveal_type(func2(seq_1d_a))  # Revealed type is "builtins.int*"

    # The bad
    reveal_type(func2(seq_1d_b))  # Revealed type is "Any"
    reveal_type(func2(seq_2d_a))  # Argument 1 to "func" has incompatible type "List[List[int]]"; expected "_NestedSequence[<nothing>]"
    reveal_type(func2(seq_2d_b))  # Revealed type is "Any"

I saw the module _nested_sequence.py available in numpy, so I tried to copy/paste it into my project, but I can't get it to work for simple use cases:

v: _NestedSequence[int]
v1: _NestedSequence[int]
v2: _NestedSequence[int]
v3: _NestedSequence[int]
v4: _NestedSequence[int]

# Fails as expected: 
# Incompatible types in assignment (expression has type "int", variable has type "_NestedSequence[int]")  [assignment]mypy(error)
v = 1

# Does not fail as expected
v1 = [1]
# Does not fail as expected
v2 = [[1]]

# Does not fail, but I expected a failure
v3 = ["a"]
v4 = [["a"]]

@BvB93 Are you aware of such behavior, or is there a problem on my side?

I'm using Python 3.8.10, with mypy==0.942 and mypy-extensions==0.4.3.

My mypy config is:

[mypy]
exclude = notebooks
scripts_are_modules = True
show_traceback = True
disallow_incomplete_defs = True
check_untyped_defs = True
disallow_untyped_defs = True
disallow_untyped_decorators = True
disallow_any_generics = True
warn_no_return = True
no_implicit_optional = True
strict_optional = True
warn_redundant_casts = True
warn_unused_ignores = True
warn_return_any = True
warn_unreachable = True
show_error_codes = True
show_column_numbers = True
ignore_missing_imports = True
implicit_reexport = True
plugins = pydantic.mypy

Thanks and happy Easter!

EDIT: I created a reproducible example

Pyright/Pylance has recursive type definition support. It's a good idea to mimic that:
https://devblogs.microsoft.com/python/pylance-introduces-five-new-features-that-enable-type-magic-for-python-developers/

I understand now that the problem only occurs when using literal values which are not annotated. Using _NestedSequence as implemented in numpy.typing:

def func1(a: _NestedSequence[int]) -> int: ...


# Fails as expected:
# Incompatible types in assignment (expression has type "int", variable has type "_NestedSequence[int]")  [assignment]mypy(error)
v1 = func1(1)   # type: ignore[arg-type]

# Does not fail as expected
v2 = func1([1])
# Does not fail as expected
v3 = func1([[1]])

# Does not fail
v4 = func1(["a"])  # [arg-type] error was expected
v5 = func1([["a"]])  # [arg-type] error was expected

# Does fail as expected
input_: List[str] = ["a"]
input__: List[List[str]] = [["a"]]
v4_ = func1(input_)  # type: ignore[arg-type]
v5_ = func1(input__)  # type: ignore[arg-type]

As @eugene-bright mentioned, pyright supports recursive type aliases, allowing for simple recursive type definitions like this:

_RecursiveSequence = _T_co | Sequence["_RecursiveSequence[_T_co]"]

This works with all of the examples above.

def func1(a: _RecursiveSequence[_T_co]) -> _T_co:
    ...

reveal_type(func1(1))  # Revealed type is "int"
reveal_type(func1([1]))  # Revealed type is "int"
reveal_type(func1([[1]]))  # Revealed type is "int"
reveal_type(func1([[[1]]]))  # Revealed type is "int"
reveal_type(func1((1, 2, 3)))  # Revealed type is "int"
reveal_type(func1([(1, 2, 3)]))  # Revealed type is "int"
reveal_type(func1([True]))  # Revealed type is "bool"

reveal_type(func1(["a"]))  # Revealed type is "str"
reveal_type(func1([["a"]]))  # Revealed type is "str"

input1: list[str] = ["a"]
input2: list[list[str]] = [["a"]]
reveal_type(func1(input1))  # Revealed type is "str"
reveal_type(func1(input2))  # Revealed type is "str"

Pyre supports recursive type aliases as well. It would be great if mypy added this support so library authors could make use of the feature.

But why are you all talking about pyright? Unless I'm missing something, we're writing comments on the mypy repository?
Switching to pyright right now for the sole benefit of recursive typing is out of the question for me at the moment. Moreover, I'm quite satisfied with the basic recursion features mypy offers.
My question concerns a behavior of mypy that I found strange (maybe incorrect?), encountered when trying a solution proposed in this thread. Maybe I should have opened an issue on the numpy repository, but I thought it was more of a mypy issue and did not want to create a duplicate issue...

Eric is the author of pyright, so it's hardly surprising that he talks about it :) I think it's helpful to consider what other type checkers do as we consider how to add recursive type support to mypy.

Sorry if my answer seemed harsh; I can only agree with your statement. I do realize that I was looking for an answer to my question more than discussing how to implement recursive types, my bad. Do you know where such a question should be asked, though?

Thanks, it's done, feel free to remove my comments as they do not contribute much to the discussion.

BvB93 commented

@charbonnierg I don’t have time to look at it right now, but I do recall seeing some oddities with strings before. My guess is that it’s string’s recursive definition (class str(Sequence[str]): …) that eventually causes mypy to cast it to _NestedSequence[Any] in the example you provided above.
This is still a bit surprising though, as both types have incompatible __contains__ definitions, regardless of generic parameter.

Mypy actually supports recursive types in classes very well; it just does not support recursive type aliases:

from typing import *                                                            
T = TypeVar("T")                                                                
                                                                                
class X(list[X[T] | T]): ...                                                    
                                                                                
foo: X[int] = X([X([1])])                                                       
reveal_type(foo)                                                                
reveal_type(foo[0]) 
$ mypy new.py --strict
new.py:7: note: Revealed type is "new.X[builtins.int]"
new.py:8: note: Revealed type is "Union[new.X[builtins.int*], builtins.int*]"

Only posting this, because I haven't seen an example like this before. It might not be very useful, but it's still recursive :)

Only posting this, because I haven't seen an example like this before. It might not be very useful, but it's still recursive :)

Possibly useful example:

Trie = Dict[str, Union["Trie", Literal["$END"]]]
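For illustration, a tiny literal of that shape, holding the words "to" and "tea" (this only type-checks once recursive aliases are supported):

words: Trie = {
    "t": {
        "o": "$END",         # "to"
        "e": {"a": "$END"},  # "tea"
    },
}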

Another realistic example: annotating the return type of json.loads().

We don’t need more examples of why this is useful. We know. We just need someone to implement it. So I have locked the issue.

I think this can now be closed by #13297 (and follow-up PRs). If there are remaining problems, they can be opened as separate issues.

Note this is only enabled on master with --enable-recursive-aliases (it enables recursive type aliases, recursive TypedDicts, and recursive NamedTuples; as mentioned above, proper classes and protocols can already be recursive). Note this is still considered an experimental feature; in particular, type inference may sometimes fail. If this happens, you can try switching to covariant collections and/or simply give an explicit annotation.
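A sketch of what now works with the flag enabled (the alias and values here are just illustrative):

from typing import Dict, List, Union

JSON = Union[Dict[str, "JSON"], List["JSON"], str, int, float, bool, None]

def keys_of(value: JSON) -> List[str]:
    # narrowing works through the recursive alias
    if isinstance(value, dict):
        return list(value)
    return []

config: JSON = {"debug": True, "retries": 3, "tags": ["a", "b"]}  # an explicit annotation helps inference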

@ilevkivskyi is there a way to set this config option in a pyproject.toml file? I checked the documentation, but couldn't find anything about the new cli flag (perhaps because it's still experimental, and will eventually be enabled by default?)

Yes, it is indeed a temporary flag (for a few releases), and in 0.990 it will be on by default. You can try:

$ cat pyproject.toml
[tool.mypy]
enable_recursive_aliases = true

This worked for me on Linux.

It appears that v0.990/v0.991 may not have addressed the entirety of the challenge that is recursive/cyclic types: #14219

For those reading in posterity:

  1. mypy==0.981 added --enable-recursive-aliases: #13297
  2. mypy==0.990 started a deprecation cycle for --enable-recursive-aliases: #13852
  3. mypy==1.7.0 completed the deprecation cycle, removing --enable-recursive-aliases: #16346

So with mypy>=1.7, recursive type support is built into mypy ✅