dbt-labs/dbt-core

[Bug] `datetime.datetime.utcnow()` is deprecated as of Python 3.12

dataders opened this issue · 6 comments

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Thanks to Talla in #adapter-ecosytem Slack channel for the heads up

Any usage of datetime.datetime.utcnow() throws the following deprecation warning in Python 3.12

DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).

related: dbt-labs/dbt-common#99

Expected Behavior

This shouldn't happen

Steps To Reproduce

Run our unit test suite with Python 3.12

Relevant log output

No response

Environment

- OS:
- Python: 3.12
- dbt: `1.7.10`

Which database adapter are you using with dbt?

No response

Additional Context

No response

You are the king @dataders 👑

Potential solution here: #5267 (comment)

TLDR

It looks to me like we can preserve the current behavior by updating references to:

datetime.utcnow()

with

datetime.now(timezone.utc).replace(tzinfo=None)

Recommendations

Short term, I'd recommend:

  • Resolve deprecation warnings and preserve current behavior (using naive datetimes everywhere)

Long-term, I'd recommend:

  • Use aware datetimes everywhere (choosing appropriate string formatting upon rendering) as described in #5267

Occurrences

This shows up in our code base in a lot of places:

More detail

# compare_datetimes.py

from datetime import timezone, datetime


# To compare/contrast effectively, set your local system time zone to something _different_ than UTC 

# Naive datetimes -- beware!
naive_1 = datetime.utcnow()  # ❌ current
naive_2 = datetime.now(timezone.utc).replace(tzinfo=None)  # ✅ recommended search/replace
naive_3 = datetime.now()  # extra beware this one!

# Aware datetimes
aware_1 = datetime.now(timezone.utc)  # standardized to UTC
aware_2 = datetime.now().astimezone()  # system local time zone

naive = [naive_1, naive_2, naive_3]
aware = [aware_1, aware_2]

for i in naive + aware:
    # Format the datetime as a string using the ISO 8601 format
    print(i.isoformat())

Run the python script:

python compare_datetimes.py

Output (using America/Boise from tz database):

2024-03-22T16:07:07.997093
2024-03-22T16:07:07.997582
2024-03-22T10:07:07.997598
2024-03-22T16:07:07.997601+00:00
2024-03-22T10:07:07.997602-06:00

TLDR: I've got a fix branch. Warnings coming from dbt-core are delt with, but those from imported packages dbt-common and logbook are pending a fix. Hence no pull request yet.

Hello there,

I found myself with some spare time during the Easter break and decided to follow through on this issue. Please forgive any impudence🙏

@dbeatty10, I second your thoughts. Found very sensible your comment, about decoupling datetime.utcnow() deprecation from features with somewhat unclear scope and action-plan (e.g. #5267).
I went over your compare_datetimes.py script, as well as the official datetime docs. All signs indeed point to datetime.now(timezone.utc).replace(tzinfo=None) as the recommended replacement, mainly because, as you pointed out, it safeguards cases in which dbt executes within environments where assumptions about system time setup might lead to unexpected behavior. It also makes it marginally easier to transition towards aware datetimes later on.

@dataders, to your helpful diagnose, I just wanted to add that besides dbt-common, logbook is the one other package causing the deprecation warnings. In fact, looks like the devs over there had dealt with this topic as of release 1.7.0, by wrapping the recommended replacement statement into a function:

if sys.version_info >= (3, 12):

    def datetime_utcnow():
        return datetime.now(timezone.utc).replace(tzinfo=None)

else:
    datetime_utcnow = datetime.utcnow

thus saving some characters in the places where UTC time is needed, like this:

first_count = now = datetime_utcnow()

In a future where dbt has standalone configuration for users to define the timezone they want for their logs, maybe a simple wrapper like that could help wire the user input across the code base.

But I digress...

To get some results firsts, I just want to help with:

  1. the execution of your recommendations @dbeatty10
  2. bringing a bit of homogeneity to the ways datetime and its classes are being invoked, such that future development would require less effort

With those two goals in mind, I put up fix branch on my fork, all code changes included.
Both unit and integration tests pass. I also created a local HTML coverage report before and after the changes, and saw no coverage degradation whatsoever, so I think codecov won't complain 🤞
EDIT: For good measure, I also built some models with both dbt-postgres and dbt-bigguery (1.8.0b1), then browsed over the resulting manifest.json files and verified that time references were accurate.

The deprecation warnings are gone, except for:

  • Those of dbt-common which probably will be patched as part of issue 99
  • Those of logbook

To address the latter, I tried bumping the package version from:

"logbook>=1.5,<1.6",

to

"logbook>=1.5,<1.7.0.post0",

However, for reasons I still do not understand, tox does not seem to realize I want it to use version 1.7.0 at test time. This is actually the only reason why I did not open a pull request yet. Any suggestions or comments are more than welcomed.

Cheers~

Upon closer look, the issue was not that tox refused to use the correct version of logbook, rather that even the latest version of the logbook package was using another related method, also deprecated (datetime.utcfromtimestamp()).

The folk over there have been informed. Until a new release is done on their end, there is not much that can be done for this issue, so I will keep the versions that is pinned currently (logbook>=1.5,<1.6) and open a pull request with code changes specific only to dbt-core 🙈