Dataclasses containing variables that reference themselves in list/dict fail to re-create as original type
Closed this issue · 13 comments
Description
When a dataclass has a field whose type is a list/dict of the dataclass itself, converting an object to a dict and then back from that dict leaves plain dictionaries in the self-typed field instead of instances of the dataclass.
Code snippet that reproduces the issue
from dataclasses import dataclass
from dataclasses_json import dataclass_json

@dataclass_json
@dataclass
class SpecialLinkedList:
    val: ...
    nexts: list['SpecialLinkedList'] = None

my_list = SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2)])
print(my_list == SpecialLinkedList.from_dict(my_list.to_dict()))  # False
Describe the results you expected
The code snippet above outputs False.
values:
SpecialLinkedList.from_dict(my_list.to_dict()) == SpecialLinkedList(val=1, nexts={'1': {'val': 2, 'nexts': None}})
my_list == SpecialLinkedList(val=1, nexts={'1': SpecialLinkedList(val=2, nexts=None)})
Python version you are using
3.10
Environment description
clean project, only dataclass and dataclasses-json.
When I do
SpecialLinkedList.from_dict(my_list.to_dict())
/lib/python3.11/site-packages/dataclasses_json/core.py:184: RuntimeWarning: `NoneType` object value of non-optional type nexts detected when decoding SpecialLinkedList.
warnings.warn(
SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2, nexts=None)])
Which is correct? In your snippet, you are comparing class references, which will always output false since those are different instances.
Yeah, of course. In my example it's more of a pseudo-code comparison. If you look at the result value of SpecialLinkedList.from_dict(my_list.to_dict()) you'll see that the nexts attribute points to a dict instead of a SpecialLinkedList, as expected and as type hinted in the SpecialLinkedList class.
That's the problem: the inner self-reference is not being converted to the class itself; it remains a dict.
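A side note on the `==` in the original snippet, as a minimal stdlib-only sketch: `@dataclass` generates a field-by-field `__eq__` by default, so two separate instances with equal fields do compare equal, and a `False` result points at a field that differs (here, a dict where an instance was expected). `Node` is a hypothetical stand-in for `SpecialLinkedList`, without dataclasses-json.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical stand-in for SpecialLinkedList (no dataclasses-json involved).
    val: int
    nexts: list = None


a = Node(1, [Node(2)])
b = Node(1, [Node(2)])                      # distinct instance, same field values
c = Node(1, [{"val": 2, "nexts": None}])    # inner item left as a plain dict

print(a is b)   # False: different objects
print(a == b)   # True: dataclass __eq__ compares fields
print(a == c)   # False: a Node is not equal to a dict
```

So a round trip that rebuilds the inner objects correctly would be expected to compare equal to the original.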
On a second look at your output it seems the bug is not reproducing, do you actually get nexts as a list of SpecialLinkedLists?
Just tried it again and the bug still replicates for me.
EDIT
I just created another clean environment to test this out.
For reference, I'm using Python 3.10.
This is my pip3 freeze output:
dataclasses-json==0.5.14
marshmallow==3.20.1
mypy-extensions==1.0.0
packaging==23.1
typing-inspect==0.9.0
typing_extensions==4.7.1
The bug replicates
Interesting! I tested on 3.11. I will re-test using your env as described and circle back here.
Hey! Any updates?
hi @NiroHaim
Sorry we have a bit of backlog, but I have this on my list and will look into it hopefully this week, worst case next week :)
Hey! Any news on this?
hi @NiroHaim, not yet, but it is on the todo list. Sorry to keep you waiting, but all team members have been a bit swamped for the past 2 months with both internal and OSS contributions, plus our ability to release to PyPI is severely impaired until GitHub fixes env protection in October. My current expectation is that I'll be able to identify the issue and send a PR, but the fix will see an actual release around October :(
So confirmed on 3.10 behaviour is different:
import sys
from dataclasses import dataclass
from dataclasses_json import dataclass_json

@dataclass_json
@dataclass
class SpecialLinkedList:
    val: int
    nexts: list['SpecialLinkedList'] = None

my_list = SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2)])
print(my_list)
# SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2, nexts=None)])
sys.version_info
# sys.version_info(major=3, minor=10, micro=12, releaselevel='final', serial=0)
SpecialLinkedList.from_dict(my_list.to_dict())
# SpecialLinkedList(val=1, nexts=[{'val': 2, 'nexts': None}])
The issue is that in 3.10 the self-reference hint is a string, lol, which causes this method to fail:
def _decode_items(type_args, xs, infer_missing):
    """
    This is a tricky situation where we need to check both the annotated
    type info (which is usually a type from `typing`) and check the
    value's type directly using `type()`.
    If the type_arg is a generic we can use the annotated type, but if the
    type_arg is a typevar we need to extract the reified type information
    hence the check of `is_dataclass(vs)`
    """
    def _decode_item(type_arg, x):
        if is_dataclass(type_arg) or is_dataclass(xs):
            return _decode_dataclass(type_arg, x, infer_missing)
        if _is_supported_generic(type_arg):
            return _decode_generic(type_arg, x, infer_missing)
        return x

    if _isinstance_safe(type_args, Collection) and not _issubclass_safe(type_args, Enum):
        return list(_decode_item(type_arg, x) for type_arg, x in zip(type_args, xs))
    return list(_decode_item(type_args, x) for x in xs)
- python3.10
print(type_args, type(type_args))
SpecialLinkedList <class 'str'>
- python3.11
print(type_args, type(type_args))
<class '__main__.SpecialLinkedList'> <class 'type'>
This is the reason: https://peps.python.org/pep-0673/ - in 3.11 they finally added a proper self type
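To illustrate the failure mode described above, here is a minimal stdlib-only sketch of one way a decoder could cope with a string type argument: map the string back to the owning class before checking `is_dataclass`. This is an assumption for illustration, not the linked PR's change and not dataclasses_json code; `resolve_self_reference` and `decode` are hypothetical names, and the toy decoder only handles lists of dataclasses.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import get_args


def resolve_self_reference(type_arg, owner_cls):
    """Hypothetical helper: map the string 'ClassName' back to the class itself."""
    if isinstance(type_arg, str) and type_arg == owner_cls.__name__:
        return owner_cls
    return type_arg


def decode(cls, data):
    """Toy decoder (not the library's): rebuilds dataclass items inside list fields."""
    kwargs = {}
    for f in fields(cls):
        value = data.get(f.name)
        args = get_args(f.type)  # e.g. ('SpecialLinkedList',) for list['SpecialLinkedList']
        if isinstance(value, list) and args:
            item_type = resolve_self_reference(args[0], cls)
            if is_dataclass(item_type):
                value = [decode(item_type, item) for item in value]
        kwargs[f.name] = value
    return cls(**kwargs)


@dataclass
class SpecialLinkedList:
    val: int
    nexts: list["SpecialLinkedList"] = None


raw = {"val": 1, "nexts": [{"val": 2, "nexts": None}]}
restored = decode(SpecialLinkedList, raw)
print(restored)
```

Without the `resolve_self_reference` step, `is_dataclass` would receive a plain string and the inner items would stay dicts, which matches the 3.10 output shown earlier.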
Linked a PR to fix this, will finalize a bit later
@NiroHaim please take a look at the linked PR - should fix this issue