Scille/umongo

Date serialization removes microseconds, but not during commit(), only when deserializing from mongo

Closed this issue · 5 comments

This might not be an issue per se, but it's something I found odd and could use some help to understand how best to handle it.

Because MongoDB stores dates with millisecond precision, I noticed that any datetime carrying microseconds gets truncated to the millisecond. So far so good!

However, I noticed that if I get the object, after committing, the date field is not yet truncated. Only when I refetch the document does it get updated correctly. For example, given a class with a default generated UTC date:

class User(Document):
    created_at = fields.AwareDateTimeField(default=lambda: datetime.datetime.now(datetime.timezone.utc))

If I create a User object and read the created_at field, it still contains the microseconds:

user = User()
await user.commit()
print(user.created_at)
# 2022-02-02T16:44:02.569516+00:00

However, if I fetch the inserted object, it won't contain microseconds:

user = User()
await user.commit()
inserted_user = await User.find_one({"_id": user.pk})
print(inserted_user.created_at)
# 2022-02-02T16:44:02.569000+00:00

I understand mongo's own limitations, but it seems odd that I'd have to refetch the document right after inserting to be able to get the correct date that got inserted in the database.
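For reference, the mismatch can be reproduced without a database. The sketch below assumes only that a BSON date is a whole number of milliseconds since the Unix epoch, so anything finer is lost on the way in. The helper name simulate_bson_round_trip is hypothetical and is not part of umongo or PyMongo; it only imitates the precision loss.

```python
import datetime

# Unix epoch as an aware datetime, used as the reference point.
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def simulate_bson_round_trip(value):
    """Hypothetical helper: keep only whole milliseconds since the epoch."""
    # Floor division of two timedeltas yields an int number of milliseconds.
    millis = (value - EPOCH) // datetime.timedelta(milliseconds=1)
    return EPOCH + datetime.timedelta(milliseconds=millis)


original = datetime.datetime(
    2022, 2, 2, 16, 44, 2, 569516, tzinfo=datetime.timezone.utc
)
stored = simulate_bson_round_trip(original)

print(original.isoformat())  # 2022-02-02T16:44:02.569516+00:00
print(stored.isoformat())    # 2022-02-02T16:44:02.569000+00:00
```

The in-memory object keeps the first value after commit(), while a refetched document carries the second one, which is the difference described above.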

Any suggestions or workarounds to get this to work without the extra DB query?

I've now worked around this by having my default= value call a separate function that mimics the _round_to_millisecond function in fields.py:

import datetime

def get_utc_date():
    now = datetime.datetime.now(datetime.timezone.utc)
    # MongoDB stores datetimes with millisecond precision, so round the
    # microseconds to the nearest millisecond before the value reaches the field.
    microseconds = round(now.microsecond, -3)
    if microseconds == 1000000:
        # Rounding reached a full second: carry it into the seconds.
        return now.replace(microsecond=0) + datetime.timedelta(seconds=1)
    return now.replace(microsecond=microseconds)

...
created_at = fields.AwareDateTimeField(default=get_utc_date)
...
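The rounding logic can be checked on fixed inputs, including the case where the microseconds round up to a full second. This is a minimal sketch: round_to_millisecond is a hypothetical standalone variant of the workaround that takes the datetime as an argument, not umongo's private function. Note that it rounds to the nearest millisecond rather than truncating, so 569516 microseconds becomes 570000, not 569000.

```python
import datetime


def round_to_millisecond(value):
    """Hypothetical helper: round a datetime to the nearest millisecond."""
    microseconds = round(value.microsecond, -3)
    if microseconds == 1000000:
        # Rounding reached a full second: carry it into the seconds.
        return value.replace(microsecond=0) + datetime.timedelta(seconds=1)
    return value.replace(microsecond=microseconds)


utc = datetime.timezone.utc
typical = datetime.datetime(2022, 2, 2, 16, 44, 2, 569516, tzinfo=utc)
rollover = datetime.datetime(2022, 2, 2, 16, 44, 2, 999600, tzinfo=utc)

print(round_to_millisecond(typical).isoformat())   # 2022-02-02T16:44:02.570000+00:00
print(round_to_millisecond(rollover).isoformat())  # 2022-02-02T16:44:03+00:00
```

Because the rounded value already sits on a millisecond boundary, storing it should not change it further, so the object in memory and a refetched document are expected to agree.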

Any insight or better suggestions to handle this are very welcome! 🙏

Yeah, the issue here is that the _round_to_millisecond function is only applied when a value is assigned to the field, not when the value comes from the field's default.

I guess your workaround is the way to go.

I'm open to suggestions about how to improve this, but the "default" feature is pure marshmallow and I don't see how to plug anything here.

Perhaps we could make _round_to_millisecond public so you wouldn't have to duplicate its content.

> Perhaps we could make _round_to_millisecond public so you wouldn't have to duplicate its content.

Yeah, this could be a simple utils method accessible to other classes.


How about using a pre_insert override to set the default value instead? Are there any blind spots to doing this, re: performance or best practices, that I might be missing?

Example:

class User(Document):
    created_at = fields.AwareDateTimeField()

    async def pre_insert(self):
        if not self.created_at:
            self.created_at = datetime.datetime.now(datetime.timezone.utc)

The first solution seems clearer to me. The pre_insert one should work fine. I don't think there is a significant perf difference.

Got it, thanks! Closing this now.