[BUG] Nullable fields are not showing up when deserializing with field(default=None, metadata=config(exclude=lambda x: x is None))
Closed this issue · 4 comments
Description
I'm not sure if I'm doing this correctly, but my goal is to deserialize JSON data that has optional properties and, when those optional properties are null, to have them not show up in the deserialized version of the data.
Let's say I have this code:
from dataclasses import dataclass, field
from typing import Optional

from dataclasses_json import dataclass_json, LetterCase, config

@dataclass_json(letter_case=LetterCase.CAMEL)
@dataclass
class NewImage:
    pk: str = field(metadata=config(field_name="PK"))
    sk: str = field(metadata=config(field_name="SK"))
    created_by: str
    created_date_time: str
    optional_attribute_1: Optional[str] = field(default=None, metadata=config(exclude=lambda x: x is None))
    optional_attribute_2: Optional[str] = field(default=None, metadata=config(exclude=lambda x: x is None))
So when I receive data that has optional_attribute_1 but doesn't have optional_attribute_2, I want it to deserialize without the missing optional attribute.
I've looked at this issue, and that's how they say to ignore null values.
Code snippet that reproduces the issue
from dataclasses import dataclass, field
from typing import Optional

from dataclasses_json import dataclass_json, LetterCase, config

@dataclass_json(letter_case=LetterCase.CAMEL)
@dataclass
class NewImage:
    pk: str = field(metadata=config(field_name="PK"))
    sk: str = field(metadata=config(field_name="SK"))
    created_by: str
    created_date_time: str
    optional_attribute_1: Optional[str] = field(default=None, metadata=config(exclude=lambda x: x is None))
    optional_attribute_2: Optional[str] = field(default=None, metadata=config(exclude=lambda x: x is None))

# i convert my json data to dict before that (i have to)
new_image = {"pk": "1", "sk": "1", "created_by": "blah", "created_date_time": "today", "optional_attribute_1": "blah"}
print(NewImage.from_dict(new_image))  # this also displays optional_attribute_2=None (see Actual)
Expected
Expecting the deserialized object to have the optional attributes when they are present in serialized form.
NewImage(pk='1', sk='1', created_by='blah', created_date_time='today', optional_attribute_1='blah')
Actual
optional_attribute_2=None is present.
NewImage(pk='1', sk='1', created_by='blah', created_date_time='today', optional_attribute_1='blah', optional_attribute_2=None)
Environment description
Python version: 3.11
Packages:
boto3==1.34.0
botocore==1.34.0
certifi==2023.11.17
charset-normalizer==3.3.2
dataclasses==0.6
dataclasses-json==0.6.1
dotenv==0.0.5
dynamodb-json==1.3
idna==3.6
jmespath==1.0.1
marshmallow==3.20.1
mypy-extensions==1.0.0
numpy==1.26.2
packaging==23.2
pandas==2.1.4
python-dateutil==2.8.2
python-dotenv==1.0.0
pytz==2023.3.post1
requests==2.31.0
s3transfer==0.9.0
simplejson==3.19.2
six==1.16.0
types-requests==2.31.0.10
typing-inspect==0.9.0
typing_extensions==4.8.0
tzdata==2023.3
urllib3==2.0.7
Updated description: added expected/actual, code highlight, added imports, moved environment details under a <details> tag.
TL;DR
You are confusing the dataclasses and dataclasses_json functionality.
Long Read
@yakovsushenok, even though I agree with your suggestion (that's a feature I would also like to exist), you are misguided. The method you are calling is __repr__, which comes from the dataclasses package itself, not from dataclasses_json. The latter controls only (de)serialization, and dataclasses handles the rest. The extra parameter exclude controls only whether the field is present in the serialized data, while __repr__ always prints every field that has repr=True (enabled by default). You can override the __repr__ generated by the dataclasses package in your class if you want.
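To see the dataclasses side of this in isolation, here is a minimal sketch that uses only the standard library. The class Image and its fields are hypothetical stand-ins for NewImage, not code from the report. It illustrates that an instance always carries every declared field (a missing one just takes its default) and that repr=False only hides a field from the generated __repr__.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Image:  # hypothetical stand-in for NewImage, standard library only
    pk: str
    optional_attribute_1: Optional[str] = None
    optional_attribute_2: Optional[str] = field(default=None, repr=False)


# Only two of the three fields are supplied, as in the report.
image = Image(**{"pk": "1", "optional_attribute_1": "blah"})

# Every declared field exists on the instance; the missing one holds its default.
field_names = [f.name for f in fields(image)]
print(field_names)
print(image.optional_attribute_2)

# repr=False hides the field from the generated __repr__, nothing more.
print(repr(image))
```

The attribute is still there and still None; only its appearance in the printed form changes.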
Example
Here, take a look:
from dataclasses import dataclass, field
from typing import Optional

from dataclasses_json import dataclass_json, LetterCase, config

@dataclass_json(letter_case=LetterCase.CAMEL)
@dataclass
class ReprTest:
    optional_exclude: Optional[str] = field(default=None, metadata=config(exclude=lambda x: x is None))
    optional_no_repr: Optional[str] = field(default=None, repr=False)

r1 = ReprTest()
r2 = ReprTest(optional_exclude='one', optional_no_repr='two')

print("FIRST:")
print(r1)
print(r1.to_json())
print()
print("SECOND:")
print(r2)
print(r2.to_json())
Output
FIRST:
ReprTest(optional_exclude=None)
{"optionalNoRepr": null}
SECOND:
ReprTest(optional_exclude='one')
{"optionalExclude": "one", "optionalNoRepr": "two"}
As you can see, __repr__ behaves the same way in both scenarios: it always prints optional_exclude and never prints optional_no_repr.
However, in the .to_json() output, optionalExclude is present in one case and absent in the other, while optionalNoRepr is always present.
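The same split between "what the object holds" and "what gets serialized" can be sketched with the standard library alone. This is only an illustration of the idea, not how dataclasses_json implements exclude; the helper without_nones and the class Image are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class Image:  # hypothetical example class
    pk: str
    optional_attribute_1: Optional[str] = None
    optional_attribute_2: Optional[str] = None


def without_nones(obj: Any) -> Dict[str, Any]:
    """Return the dataclass as a dict, leaving out top-level None values."""
    return {k: v for k, v in asdict(obj).items() if v is not None}


image = Image(pk="1", optional_attribute_1="blah")

# The instance still has optional_attribute_2; only the output omits it.
compact = without_nones(image)
print(json.dumps(compact))
```

Filtering happens at serialization time, so the instance itself is unchanged and its __repr__ still lists every field.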
If you still want this behaviour, you can use this as a reference:
from abc import ABC
from dataclasses import dataclass, fields, field
from typing import List, Optional

@dataclass
class DataclassSmartRepr(ABC):
    def __repr__(self):
        # Collect only the fields that have repr=True and a non-None value.
        tokens: List[str] = []
        for f in fields(self):
            if (f.repr and (v := getattr(self, f.name, None)) is not None):
                tokens.append(f'{f.name}={v!r}')
        return f"{type(self).__name__}({', '.join(tokens)})"

@dataclass
class ReprTest:
    optional_with_repr_one: Optional[str] = field(default=None, repr=True)
    optional_with_repr_two: Optional[str] = field(default=None, repr=True)
    optional_no_repr: Optional[str] = field(default=None, repr=False)

    # @dataclass does not overwrite a __repr__ defined in the class body.
    __repr__ = DataclassSmartRepr.__repr__

DataclassSmartRepr.register(ReprTest)

r1 = ReprTest()
r2 = ReprTest(optional_with_repr_one='one', optional_with_repr_two='two', optional_no_repr='three')

print("FIRST:")
print(r1)
print()
print("SECOND:")
print(r2)
Output:
FIRST:
ReprTest()
SECOND:
ReprTest(optional_with_repr_one='one', optional_with_repr_two='two')
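A possible variation, sketched under the assumption that plain inheritance is acceptable in your code base: put the custom __repr__ in a mixin and pass repr=False to @dataclass, because otherwise the decorator generates a __repr__ on the subclass that shadows the inherited one. The name SmartReprMixin is hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


class SmartReprMixin:  # hypothetical name, for illustration only
    def __repr__(self) -> str:
        # Print only fields that have repr=True and a non-None value.
        tokens: List[str] = []
        for f in fields(self):
            value = getattr(self, f.name, None)
            if f.repr and value is not None:
                tokens.append(f"{f.name}={value!r}")
        return f"{type(self).__name__}({', '.join(tokens)})"


@dataclass(repr=False)  # keep the inherited __repr__ instead of generating one
class ReprTest(SmartReprMixin):
    optional_with_repr_one: Optional[str] = None
    optional_no_repr: Optional[str] = field(default=None, repr=False)


empty = ReprTest()
filled = ReprTest(optional_with_repr_one="one", optional_no_repr="three")
print(empty)
print(filled)
```

This avoids both the explicit __repr__ assignment and the register call, at the cost of adding a base class to every dataclass that wants the behaviour.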
Thanks @USSX-Hares, I understand now.