python/mypy

Recognizing wrapped classes as supporting a Protocol

a-recknagel opened this issue ยท 23 comments

from dataclasses import dataclass
from typing import Dict

from typing_extensions import Protocol

@dataclass
class A:
    pass

class IsDataclass(Protocol):
    __dataclass_fields__: Dict

def dataclass_only(x: IsDataclass):
    pass

dataclass_only(A())

When running mypy on this code, I'd expect no warning, but I get

test.py:16: error: Argument 1 to "dataclass_only" has incompatible type "A"; expected "IsDataclass".

While checking for __dataclass_fields__ might not look like the best thing to go for, it's what dataclasses.is_dataclass checks for as well, so this seems like the most reliable condition right now.

I'd assume it happens because the dataclass-wrapper adds this attribute, and mypy only checks the class definition itself. Can this be fixed?

This can be supported by just adding __dataclass_fields__ Var to the symbol table in the plugin. This is quite exotic use case, so setting priority to low. On the other hand, if you have time, PRs are welcome, this shouldn't be hard to fix.

I'm not 100% sure what you mean with adding the symbol to the plugin table, do you mean this?

@dataclass
class A:
    __dataclass_fields__: Dict

Because that does fix it without side effects, since the dataclass wrapper just overrides it. Then again, once I start adding fields to my dataclasses only to fulfill a protocol I might as well inherit from a classic no-op interface class.

Low prio is what I think as well, I'll look if I can manage to write a proper fix.

I'm not 100% sure what you mean with adding the symbol to the plugin table, do you mean this?

You misread what I wrote, I meant fixing this in mypy plugin for dataclasses (dataclasses in mypy are supported by a bundled/builtin plugin).

I see, thanks for the info =)

So we are talking about mypy/plugins/dataclasses.py here, right @ilevkivskyi ?

I would like to have that fixed because I see no other reliable way of type hinting data classes.
Have never touched mypy source code, but I can give it a try.

So we are talking about mypy/plugins/dataclasses.py here

Yes.

Have never touched mypy source code, but I can give it a try.

Please go ahead! If you feel stuck, feel free to ask here or on Gitter.

I will certainly have some questions, thanks :)

For now, in order to leave it commented here, I managed to track the type checking to the subtypes submodule. Today, I am simply trying to get the gist of what is mypy's flow.

The function call that isn't returning what we expect in order to validate the protocol is at subtypes.py(481):

subtype = find_member(member, left, left)

Where member is the attribute we are checking (i.e.,__dataclass_fields__), and left is A class.

find_member consists of:

	    def get_method(self, name: str) -> Optional[FuncBase]:
	        for cls in self.mro:
	            if name in cls.names:
	                node = cls.names[name].node
	                if isinstance(node, FuncBase):
	                    return node
	                else:
	                    return None

Where:

self = <TypeInfo foo.A>
self.mro = [<TypeInfo foo.A>, <TypeInfo builtins.object>]
name = '__dataclass_fields__'

Given self.names is an empty SymbolTable, name is never in cls.names. Tweaking with dataclasses' symbol table should make __dataclass_fields__ visible here, then making A compliant with the protocol IsDataclass.

Thoughts here so far?

Tweaking with dataclasses' symbol table should make __dataclass_fields__ visible here, then making A compliant with the protocol IsDataclass.

Yes, we already add __init__ and other things to the symbol table in plugins/dataclasses.py, just add a Var there with a type like Dict[str, Any].

It seems I was following the wrong thread there ๐Ÿ˜… Ended up at the semanal.analyze_class class, which analyzes the definitions (defn.defs.body). Still, that was very instructive :)

Gonna go for the plugin now, thanks

How does Var relate to TypeInfo? Creating a Var via Var(random_name, Dict[Any, Any] initializes it with VAR_NO_INFO, for example.
I see it is assigned at some places, like semanal.make_name_lvalue_var

How does Var relate to TypeInfo?

For instance/class attributes, Var.info should be set to the class where it was defined (can be get as ClassDefContext.cls.info).

That makes sense, thanks!

Aren't we supposed to have some sort of Dict node-type at mypy/types.py? I see TypedDict there, which doesn't seem to be the right fit here

Aren't we supposed to have some sort of Dict node-type at mypy/types.py?

No, just use ctx.api.named_type as in several other places in dataclasses.py plugin.

I couldn't tell if there was any progress on this so I attempted solution. It seems to work, but I'm sure I'm missing some things.

My solution was to add the following to the DataclassTransformer.transform method

dataclass_fields_var = Var('__dataclass_fields__')
dataclass_fields_var.info = ctx.cls.info
info.names['__dataclass_fields__'] = SymbolTableNode(MDEF, dataclass_fields_var)

I tried to give dataclass_fields a type but couldn't get the syntax of ctx.api.named_type() correct to set it to Dict[str, Any]

Any / all comments are appreciated.

Sorry for the lack of update, got caught up in a series of other stuff at the second semester of 2019.
Even though I debugged and inspected some of the mypy code, I could not manage to figure out the right way to approach it nor actually understand what is the mental model used on it.

@ilevkivskyi is what @amosson has written right? Any comments?

I would like the help fix this, but unsure how to proceed.

@amosson can you create a PR or fork? and we will work on this.

This sounds reasonable. Note there is a draft PR #8578, but I don't know if the author is still interested.

I have been waiting on a fix for this for a while. Is there anything we can do to get #8578 moving again?

I would like to see this resolved as well. I left some comments addressing the outstanding change requests on #8578, hopefully this would unblock the author. Though it looks like he will need to address some merge conflicts as well.

If the author is unresponsive, is it good practice to create a duplicate PR with his changes?

tkukushkin updated the branch and fixed the failing tests last week. Its just waiting on a review from @JelleZijlstra.

Hi! It seems to have broken in the version 0.991

I think you'll need to specify it as a ClassVar

It works. Thank you!

class DataClassProtocol(Protocol):
    __dataclass_fields__: ClassVar[dict]