litestar-org/polyfactory

Bug: SQLAlchemy Mapped field type not used correctly

anthonyjgraff opened this issue · 10 comments

Description

In the below example, you would expect foo.id to be an int

When get_field_value is called, field_meta.type_args has correctly been identified as an int - however, we end up calling the unwrapped annotation directly

This behaves as expected in polyfactory==2.2, provided from __future__ import annotations is not imported - and seems to have broken somewhere between there and polyfactory==2.3

URL to code causing the issue

https://github.com/litestar-org/polyfactory/blob/main/polyfactory/factories/base.py#L673

MCVE

from polyfactory.factories import DataclassFactory
from sqlalchemy.orm import DeclarativeBase, Mapped, MappedAsDataclass, mapped_column


class Base(MappedAsDataclass, DeclarativeBase):
    pass


class Foo(Base):
    __tablename__ = "foo"

    id: Mapped[int] = mapped_column(primary_key=True)


class FooFactory(DataclassFactory[Foo]):
    __model__ = Foo


def test_factory():
    foo = FooFactory.build()
    print(type(foo))
    assert isinstance(foo.id, int)


if __name__ == "__main__":
    test_factory()

Steps to reproduce

No response

Screenshots

No response

Logs

<class 'sqlalchemy.orm.base.Mapped'>
Traceback (most recent call last):
  File "/app/tests/test_example.py", line 26, in <module>
    test_factory()
  File "/app/tests/test_example.py", line 22, in test_factory
    assert isinstance(foo.id, int)
AssertionError

Release Version

Python 3.11.4

polyfactory==2.8.0
SQLAlchemy==2.0.16

Platform

  • Linux
  • Mac
  • Windows
  • Other (Please specify in the description above)

Funding

  • If you would like to see an issue prioritized, make a pledge towards it!
  • We receive the pledge once the issue is completed & verified
Fund with Polar
guacs commented

I don't think this is a bug. The dataclass factory is not meant to be used to create instances of SQLA models. There is a SQLA factory that is being worked on, but until that is completed, I think you'll have to create a custom factory that can properly handle the Mapped types as well. If you need any help with that, I'd be happy to help.

My understanding was that the MappedAsDataclass class means that the SQLAlchemy model is converted into a Python dataclass - which means it should theoretically be compatible with DataclassFactory (even if there are other limitations)

As mentioned above, the above example actually works with polyfactory==2.2 - but I'm guessing some of the improvements made to add support for Pydantic 2.X changed how types are worked out

Totally understand this may be an issue with how Mapped has been implemented, rather than an issue with how polyfactory works out types - but if it is a polyfactory issue, would be happy to contribute a fix if anyone can offer a little guidance!

guacs commented

Aah I completely missed the MappedAsDataclass! Yeah in this case, this is absolutely a bug. Taking a quick look into it, I think the issue is in this line:

model_type_hints = get_type_hints(cls.__model__, include_extras=True)

I think the fix would be to use the API provided in the dataclasses module to get the types instead of the get_type_hints. A PR for this would definitely be welcome :)

guacs commented

The fix for this was reverted in #383 since forward references are not being properly resolved. @anthonyjgraff, I think as a workaround, you'll have to override the get_model_fields on your factory or create a custom base factory (subclassed from DataclassFactory) that can properly handle MappedAsDataclass as well.

Sorry about that!

Think for my specific use case, I may just be able to switch over to the new SQLAlchemy factory - that makes use of some SQLAlchemy specific attributes to resolve fields in the way I'd expect - so feel free to close this problem.

May take another look at this though, as this may help support the dataclass transform decorator more generally (https://docs.python.org/3/library/typing.html#typing.dataclass_transform)

guacs commented

@anthonyjgraff, don't worry about it. It's more due to the implementation of the fields API rather than your implementation. Actually, if we add the cls.__model__.__module__ to the module parameter (I think that's the name), it does work. There's only one case that it doesn't work which is when the class that is forward annotated (by keeping it under if TYPE_CHECKING) that fails since we don't know the module where that class was defined in.

guacs commented

I'm gonna keep this open since I do think we should support this and other cases like this that may exist where classes behave like a dataclass even if they're not actually a dataclass.

How is this supposed to work with a non-init column? (init=False)

overriding should_set_field_value seems to work for now....

from typing import Any
from polyfactory.factories.sqlalchemy_factory import SQLAlchemyFactory, FieldMeta
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import MappedAsDataclass


class Base(MappedAsDataclass, DeclarativeBase):
    """subclasses will be converted to dataclasses"""


class User(Base):
    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    name: Mapped[str]


class UserFactory(SQLAlchemyFactory[User]):
    __model__ = User

    @classmethod
    def should_set_field_value(cls, field_meta: FieldMeta, **kwargs: Any) -> bool:
        def is_init() -> bool:
            if not getattr(cls.__model__, '__dataclass_fields__'):
                return True
            return cls.__model__.__dataclass_fields__[field_meta.name].init
        return not field_meta.name.startswith("_") and field_meta.name not in kwargs and is_init()

def test_factory():
    user = UserFactory.build()
    print(type(user))
    assert isinstance(user.name, str)


if __name__ == "__main__":
    test_factory()

This brings up a larger issue though because the DB is supposed to be setting the init=False vals. Maybe I'm doing this completely wrong. Any help would be appreciated.

guacs commented

How is this supposed to work with a non-init column? (init=False)

Giving the init and using the SQLAlchemyFactory won't work since it doesn't consider it as a dataclass, but as a SQL Alchemy model. So, the factory doesn't know whether to include this in init or not. Until this issue is resolved, I think you're going to have to use what you've done as a workaround.

Yeah, I think this is a bug since this should be ignored by default without you having to override the should_set_field_value.