python-pendulum/pendulum

Bug / regression: Unable to parse string '031' with 'DDDD' format using Pendulum 3.0.0

philippe-bollard opened this issue · 1 comments

Context

  • OS version and name: Debian GNU/Linux 11.8 64bits
  • Python version: 3.9.2
  • Pendulum version: 3.0.0

Issue

Pendulum 3 is unable to parse string '031' with 'DDDD' format:

>>> import pendulum
>>> pendulum.__version__
'3.0.0'

>>> p = pendulum.from_format(f"2023-031", 'YYYY-DDDD')

Traceback (most recent call last):

  File "[..]/python3.9/site-packages/pendulum/__init__.py", line 284, in from_format
    def duration(
  File "[..]/python3.9/site-packages/pendulum/formatting/formatter.py", line 416, in parse
    
  File "[..]/python3.9/site-packages/pendulum/formatting/formatter.py", line 482, in _check_parsed
    "{}-{:>03d}".format(validated["year"], parsed["day_of_year"])
  File "[..]/python3.9/site-packages/pendulum/parser.py", line 30, in parse
    
  File "[..]/python3.9/site-packages/pendulum/parser.py", line 43, in _parse
    return pendulum.now()
  File "[..]/python3.9/site-packages/pendulum/parsing/__init__.py", line 78, in parse
    """
  File "[..]/python3.9/site-packages/pendulum/parsing/__init__.py", line 125, in _parse
    # so we fallback on the dateutil parser
pendulum.parsing.exceptions.ParserError: Unable to parse string [2023-031]

but it works for other values:

>>> p = pendulum.from_format(f"2023-030", 'YYYY-DDDD')
>>> p
DateTime(2023, 1, 30, 0, 0, 0, tzinfo=Timezone('UTC'))
>>> p = pendulum.from_format(f"2023-032", 'YYYY-DDDD')
>>> p
DateTime(2023, 2, 1, 0, 0, 0, tzinfo=Timezone('UTC'))

The bug does not seem to be present in the previous version of Pendulum (2.1.2):

>>> import pendulum
>>> pendulum.__version__
'2.1.2'
>>> p = pendulum.from_format(f"2023-031", 'YYYY-DDDD')
>>> p
DateTime(2023, 1, 31, 0, 0, 0, tzinfo=Timezone('UTC'))

The bug seems to affect every ordinal day that falls on the last day of a month:

  • "2024-031" - 2024-01-31
  • "2024-060" - 2024-02-29
  • "2024-091" - 2024-03-31
  • etc...
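
To list the inputs that would be worth testing, here is a minimal sketch that enumerates the month-end ordinal strings for a given year using only the standard library. The helper name month_end_ordinals is made up for illustration; it does not call Pendulum itself.

```python
import calendar
from datetime import date


def month_end_ordinals(year):
    """Return 'YYYY-DDD' strings for the last day of each month of `year`."""
    result = []
    for month in range(1, 13):
        # monthrange() returns (weekday of first day, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        day_of_year = date(year, month, last_day).timetuple().tm_yday
        result.append(f"{year}-{day_of_year:03d}")
    return result


print(month_end_ordinals(2024)[:3])  # ['2024-031', '2024-060', '2024-091']
```

Each returned string could then be passed to pendulum.from_format(value, 'YYYY-DDDD') to see which ones raise ParserError.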

This bug appears when pendulum._pendulum.parse_iso8601 is called in https://github.com/sdispater/pendulum/blob/3.0.0/src/pendulum/parsing/__init__.py#L113
It should return a Date object but instead raises a ValueError.

In [7]: from pendulum._pendulum import parse_iso8601

In [8]: parse_iso8601("2024-090")
Out[8]: datetime.date(2024, 3, 30)

In [9]: parse_iso8601("2024-091")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[9], line 1
----> 1 parse_iso8601("2024-091")

ValueError: day is out of range for month
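
As a possible stopgap until this is fixed, the ordinal date can be resolved with the standard library alone. This is only a sketch: parse_ordinal is a hypothetical helper, it returns a plain datetime.date rather than a Pendulum object, and it assumes the input always has the 'YYYY-DDD' shape.

```python
from datetime import date, timedelta


def parse_ordinal(text):
    """Convert a 'YYYY-DDD' string to a datetime.date (hypothetical helper)."""
    year_part, day_part = text.split("-")
    year, day_of_year = int(year_part), int(day_part)
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    if not 1 <= day_of_year <= days_in_year:
        raise ValueError(f"day of year {day_of_year} is out of range for {year}")
    # Day 1 is January 1st, so offset by (day_of_year - 1) days.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


print(parse_ordinal("2024-091"))  # 2024-03-31
print(parse_ordinal("2023-031"))  # 2023-01-31
```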

The error is probably located in https://github.com/sdispater/pendulum/blob/3.0.0/rust/src/parsing.rs#L212

As that code is written in Rust, I can't investigate any further why this is happening.
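
For what it's worth, the symptoms are consistent with an off-by-one in the comparison against cumulative month lengths. The Python sketch below is purely a guess that reproduces the same behaviour; it is not the actual Rust code, and the function names and the `inclusive` flag are invented for illustration.

```python
import calendar
from datetime import date


def cumulative_month_ends(year):
    """Day-of-year of the last day of each month, e.g. [31, 60, 91, ...] for 2024."""
    ends, total = [], 0
    for month in range(1, 13):
        total += calendar.monthrange(year, month)[1]
        ends.append(total)
    return ends


def ordinal_to_date(year, ordinal, inclusive=True):
    """Map a day-of-year to a date; inclusive=False mimics the suspected bug."""
    previous_end = 0
    for month, end in enumerate(cumulative_month_ends(year), start=1):
        # A strict '<' pushes the last day of a month into the next month as day 0.
        in_month = ordinal <= end if inclusive else ordinal < end
        if in_month:
            return date(year, month, ordinal - previous_end)
        previous_end = end
    raise ValueError("ordinal out of range for year")


print(ordinal_to_date(2024, 90, inclusive=False))  # 2024-03-30
print(ordinal_to_date(2024, 91, inclusive=True))   # 2024-03-31
try:
    ordinal_to_date(2024, 91, inclusive=False)     # date(2024, 4, 0)
except ValueError as exc:
    print(exc)  # day is out of range for month
```

With the strict comparison, only month-end ordinals fail while every other day resolves correctly, which matches what is reported above.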