python/mypy

Dataclasses, `__replace__` and LSP violations


Python 3.13 added the __replace__ protocol, which lets arbitrary classes define the method so that copy.replace can call it to perform the copy. Along with this feature, dataclasses now get a generated __replace__ method (and, by extension, this applies to @dataclass_transform() as well).
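For readers unfamiliar with the protocol, here is a minimal sketch of a non-dataclass class opting in. The Point class and the replace helper are hypothetical names used only for illustration; the helper falls back to calling the method directly so the snippet also runs on versions older than 3.13, where copy.replace does not exist.

```python
import copy


class Point:
    """Plain (non-dataclass) class that opts in to the protocol."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __replace__(self, **changes: int) -> "Point":
        # Start from the current values, then apply the requested changes.
        values = {"x": self.x, "y": self.y}
        values.update(changes)
        return Point(**values)


def replace(obj, **changes):
    # Hypothetical helper: copy.replace only exists on Python 3.13+, so fall
    # back to calling the protocol method directly on older versions.
    if hasattr(copy, "replace"):
        return copy.replace(obj, **changes)
    return obj.__replace__(**changes)


p = Point(1, 2)
q = replace(p, y=5)  # new object; p is left untouched
```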

Type checkers (at least mypy and pyright) synthesize a __replace__ method (like they do for __init__) based on the defined fields. A common pattern when using dataclasses is to override a field with a more precise type:

from dataclasses import dataclass
from typing import Literal

@dataclass
class Base:
    foo: str
    

@dataclass
class Sub(Base):
    foo: Literal["test"]

While this code is technically unsafe, mypy does not raise any error unless you enable the mutable-override error code. Note that this error code isn't even enabled in strict mode.
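To make the "technically unsafe" part concrete, here is a small sketch of how the narrowed field can be broken through the base class. The rename function is a hypothetical helper added for illustration only.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class Base:
    foo: str


@dataclass
class Sub(Base):
    foo: Literal["test"]


def rename(obj: Base) -> None:
    # Perfectly valid for Base: foo is a mutable str field.
    obj.foo = "anything"


sub = Sub(foo="test")
rename(sub)  # accepted, because Sub is a subclass of Base
# sub.foo now holds "anything", which is not a valid Literal["test"].
```

This is the situation the mutable-override error code is meant to catch.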

However, mypy (if the configured Python version is 3.13 or greater) will emit an error regarding the LSP violation for the synthesized __replace__ method (playground).
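As a rough illustration of where the error comes from, the sketch below writes out by hand what the synthesized methods approximately look like. The exact signatures are an assumption on my part (keyword-only arguments with "not provided" defaults, spelled ... as in stub files), and these classes are plain classes rather than real dataclasses.

```python
from typing import Literal


class Base:
    def __init__(self, foo: str) -> None:
        self.foo = foo

    # Assumed shape of the method synthesized for the dataclass Base.
    def __replace__(self, *, foo: str = ...) -> "Base":
        return type(self)(self.foo if foo is ... else foo)


class Sub(Base):
    # Assumed shape for Sub: the parameter type is narrower than in Base,
    # which is the incompatible override that gets reported.
    def __replace__(self, *, foo: Literal["test"] = ...) -> "Sub":
        return type(self)(self.foo if foo is ... else foo)


def reset(obj: Base) -> Base:
    # Valid against Base.__replace__, but Sub.__replace__ only declares
    # Literal["test"]. Called directly here purely for illustration.
    return obj.__replace__(foo="other")


result = reset(Sub("test"))
# Nothing fails at runtime, yet result is a Sub whose foo is "other".
```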

I believe this is confusing/annoying for users, because:

  • I expect the vast majority of users to not know about the Liskov substitution principle (LSP), nor the __replace__ protocol. It is also confusing to see an error popping up about this method when it is not defined by the end user.
  • Users might not care about the replace protocol at all.
  • There's no way to ignore the error (or if there is, I'm not aware of it), as no line number is provided in the error (because the __replace__ method is synthesized).

In contrast, while pyright emits an error for the incompatible field override [1] (essentially doing the same thing as mypy would if mutable-override were enabled), it doesn't report anything related to the __replace__ LSP violation.

In Pydantic, we got a first report about this issue on Oct 24, about two weeks after the final 3.13 release. We also got a similar report today, this time when using aliases. I raised this in a discussion a while ago, and commented that from a practical perspective, this isn't ideal.

Here are a couple of ideas for moving forward:

  • mypy does not emit any violation regarding the synthesized __replace__ method.
  • mypy only emits a violation regarding the synthesized __replace__ method if the mutable-override error code is enabled.

I would personally go with option one, especially because while synthesizing a __replace__ method makes sense, it does not provide any practical benefit. Users should not call this method directly, and should instead use copy.replace, which isn't special-cased by any type checker as of today (i.e. you don't get any errors if you don't provide the correct arguments to the copy.replace function).
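A small sketch of that last point: a misspelled field name passed to copy.replace only surfaces at runtime. Config is a hypothetical class, and on versions older than 3.13 the snippet falls back to dataclasses.replace, which I'm assuming behaves the same way for dataclasses.

```python
import copy
import dataclasses
from dataclasses import dataclass


@dataclass
class Config:
    name: str


# copy.replace needs Python 3.13+; dataclasses.replace is used as a
# stand-in on older versions (assumption: same behavior for dataclasses).
replace = getattr(copy, "replace", dataclasses.replace)

cfg = Config(name="a")
updated = replace(cfg, name="b")  # correct usage, returns a new Config

try:
    replace(cfg, nmae="b")  # typo in the field name
except TypeError as exc:
    # The mistake is only caught here, when the new instance is built.
    error = exc
else:
    error = None
```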

Footnotes

  [1] This was debated at length here.

Yeah, I agree with your option 1. PR welcome.

See also #17623; we may want to ignore __replace__ when inferring variance as well.