python-attrs/cattrs

TYPE_CHECKING and init=False

AdrianSosic opened this issue · 7 comments

  • cattrs version: 23.2.3
  • Python version: 3.9.18
  • Operating System: macOS

Description

Hi, I think I'm facing a similar situation as described in #160 but with a few subtleties that potentially motivate refining the internal type resolution of cattrs.

What I Did

Consider the following two files:

# inner.py
# --------

from attrs import define


@define
class Inner:
    pass
# outer.py
# --------

from typing import TYPE_CHECKING, Optional

import cattrs
from attrs import define, field

if TYPE_CHECKING:
    from inner import Inner


@define
class Outer:
    inner: Optional[Inner] = field(init=False, default=None)


outer = Outer()
cattrs.unstructure(outer)

Running the latter gives NameError: name 'Inner' is not defined because the inner class is not available in outer.py.

While one could fix this by informing cattrs about the class before attempting to unstructure, I don't think it's a convenient solution in my case because I don't even see a reason why it should be necessary in the first place: the corresponding attribute is init=False and with the default settings where such fields are ignored, cattrs shouldn't even need to bother about the field at all ... assuming of course that I do not overlook some other reason.

The motivation for my example: in my case, Inner relies on some heavy external dependencies (such as torch) and many execution paths of my code don't actually require loading it, so I instead use lazy-loading for improved startup time. That said, I wouldn't want to load it just to satisfy cattrs to perform an unstructuring operation that actually does not require the class due to init=False.

For now, I can simply remove that type annotation, which solves the problem, but this is of course suboptimal. Any thoughts would be highly appreciated.

I assume you're using from __future__ import annotations?

cattrs calls attrs.resolve_types on every class it sees, which calls into typing.get_type_hints which explodes. I don't think there's an easy way for us to be selective about which fields we want resolved, unfortunately. cattrs only wants init=True fields in this case, but I don't think there's an easy way to communicate that to attrs.

Hi @Tinche, as always: thanks for connecting so quickly 🙃 🥇

I assume you're using from __future__ import annotations?

Yes, absolutely. Somehow got lost when copy-pasting.

cattrs calls attrs.resolve_types on every class it sees, which calls into typing.get_type_hints which explodes. I don't think there's an easy way for us to be selective about which fields we want resolved, unfortunately. cattrs only wants init=True fields in this case, but I don't think there's an easy way to communicate that to attrs.

The sequence of calls sounds logical and is what I would have expected to happen in the back. However – and perhaps this is a stupid question because I don't know the cattrs internals in depth – why is this an attrs issue? In my naive opinion: couldn't this problem be avoided by "simply" calling attrs.resolve_types only on those types that actually need to be resolved? That is, for any class in a given code, it's either "class needs to be serialized -> cattrs/converter needs to bother" or "class needs no serialization, e.g. because it only appears with init=False -> cattrs/converter can ignore it". So if resolve_types was only called on the former, the problem wouldn't arise, right?

Here's the actual piece of code that calls attrs.resolve_types:

if any(isinstance(a.type, str) for a in attrs):
. So we don't do it always, we only do it if there is a string annotation in one of the fields.

To help with clarity, let's distinguish between classes and fields. We call attrs.resolve_types on classes, which resolves their fields so we can use them.

So in this case, cattrs will call attrs.resolve_types(Outer) because that will ensure the following call of attrs.fields(Outer) will return the proper metadata (that's just how attrs introspection works in general).

Now, we could make this condition a little more sophisticated and account for init=False fields, but in your example, presumably Outer would have some other attributes you would want serialized. Those annotations would need to be de-stringified to work. And we can't call resolve_types and tell it to just resolve init fields.

Ah ok, now I get it, thanks for the clarification 👍🏼 So it is an attrs issue, indeed ... Given that cattrs is the companion package of attrs, do you think there is a chance to tackle it from that side to enable such use cases for cattrs? After all, I've understood that both packages deeply care about performance and this really is a performance-related issue ;)

Just had a quick look at the source code of resolve_types and noticed that it takes an optional attribs parameter that in fact let's control which fields to loop over:
https://github.com/python-attrs/attrs/blob/5ce5d8278f162ec7542a251091427fd88e538554/src/attr/_funcs.py#L463C9-L463C66
In principle, that would offer an easy solution, no? So instead of calling resolve_types with the class only, one could filter down to init=True attributes only and thus solve it from the cattrs side 🤔

Before we give up on this issue, quickly pulling in @hynek with the hope he could share his opinion on the attrs side of things.

@hynek, quick summary for you, so you don't need to go through the entire chain of messages. The described problem is roughly:

  • We have some attrs fields that are not relevant for serialization via cattrs (e.g. they are init=False / should not be part of the serialized object)
  • The types of these fields are defined only in if TYPE_CHECKING because they belong to some heavy third-party dependency that we import lazily only when really needed at runtime
  • As far as serialization is concerned, cattrs should be happy because it doesn't need to bother about these attributes at all.
  • However, because it needs to call attrs.resolve_types on the class, we run into a problem, since the types are not (yet) available at the time of the call

Can we somehow circumvent the problem? My current (unsatisfactory workaround) is that I omit the types from the code and only have them in a comment for me as a reminder, losing the ability to fully type-check my classes.

Heh you’re asking the guy who asks @Tinche whenever he runs into a tricky typing problem. I’m afraid his replies are as good as it gets.