arrow-py/arrow

span_range returns unexpected result (missing days) when using frame month

Issue Description

Depending on the start_date/end_date values, some days are missing when a range of dates is split using span_range with frame=month and exact=True.

Doc: https://arrow.readthedocs.io/en/latest/api-guide.html#arrow.arrow.Arrow.span_range

Reproduction

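import arrow

# Range boundaries from the failing case
start_date = arrow.get("2023-01-31T00:00:00+02:00")
end_date = arrow.get("2024-08-20T16:08:09.538605+02:00")

# exact=True starts the first span exactly at start_date and truncates the last one at end_date
spans = [
    [start.isoformat(), end.isoformat()]
    for start, end in arrow.Arrow.span_range("month", start_date.datetime, end_date.datetime, exact=True)
]
print(spans)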

Output

[
	['2023-01-31T00:00:00+02:00', '2023-02-27T23:59:59.999999+02:00'],
	['2023-02-28T00:00:00+02:00', '2023-03-27T23:59:59.999999+02:00'],
	['2023-03-31T00:00:00+02:00', '2023-04-29T23:59:59.999999+02:00'],  # missing March 28th to 30th
	['2023-04-30T00:00:00+02:00', '2023-05-29T23:59:59.999999+02:00'],
	['2023-05-31T00:00:00+02:00', '2023-06-29T23:59:59.999999+02:00'],
	['2023-06-30T00:00:00+02:00', '2023-07-29T23:59:59.999999+02:00'],
	['2023-07-31T00:00:00+02:00', '2023-08-30T23:59:59.999999+02:00'],
	['2023-08-31T00:00:00+02:00', '2023-09-29T23:59:59.999999+02:00'],
	['2023-09-30T00:00:00+02:00', '2023-10-29T23:59:59.999999+02:00'],
	['2023-10-31T00:00:00+02:00', '2023-11-29T23:59:59.999999+02:00'],
	['2023-11-30T00:00:00+02:00', '2023-12-29T23:59:59.999999+02:00'],
	['2023-12-31T00:00:00+02:00', '2024-01-30T23:59:59.999999+02:00'],
	['2024-01-31T00:00:00+02:00', '2024-02-28T23:59:59.999999+02:00'],
	['2024-02-29T00:00:00+02:00', '2024-03-28T23:59:59.999999+02:00'],
	['2024-03-31T00:00:00+02:00', '2024-04-29T23:59:59.999999+02:00'],
	['2024-04-30T00:00:00+02:00', '2024-05-29T23:59:59.999999+02:00'],
	['2024-05-31T00:00:00+02:00', '2024-06-29T23:59:59.999999+02:00'],
	['2024-06-30T00:00:00+02:00', '2024-07-29T23:59:59.999999+02:00'],
	['2024-07-31T00:00:00+02:00', '2024-08-20T16:08:09.538604+02:00']  # missing July 30th
]

Expected result

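No day should be missing from the spans returned by span_range. However, the output above:

  • jumps from 2023-03-27T23:59:59.999999+02:00 to 2023-03-31T00:00:00+02:00, skipping 3 days (between the 2nd and 3rd spans)
  • jumps from 2024-07-29T23:59:59.999999+02:00 to 2024-07-31T00:00:00+02:00, skipping 1 day (just before the last span)

The same kind of gap appears elsewhere in the output, for example between 2023-05-29T23:59:59.999999+02:00 and 2023-05-31T00:00:00+02:00.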
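
To make the gaps easier to spot, here is a minimal sketch that scans a list of [start, end] ISO strings and reports every place where the next span does not begin one microsecond after the previous one ends. It uses only the standard library. The helper name `find_gaps` is hypothetical, and the sample rows are the first four rows of the output reported above.

```python
from datetime import datetime, timedelta

ONE_MICROSECOND = timedelta(microseconds=1)


def find_gaps(spans):
    """Return (previous_end, next_start) pairs where consecutive spans are not contiguous."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        prev_end_dt = datetime.fromisoformat(prev_end)
        next_start_dt = datetime.fromisoformat(next_start)
        # Contiguous spans differ by exactly one microsecond.
        if next_start_dt - prev_end_dt > ONE_MICROSECOND:
            gaps.append((prev_end, next_start))
    return gaps


# First four rows of the reported output.
reported = [
    ["2023-01-31T00:00:00+02:00", "2023-02-27T23:59:59.999999+02:00"],
    ["2023-02-28T00:00:00+02:00", "2023-03-27T23:59:59.999999+02:00"],
    ["2023-03-31T00:00:00+02:00", "2023-04-29T23:59:59.999999+02:00"],
    ["2023-04-30T00:00:00+02:00", "2023-05-29T23:59:59.999999+02:00"],
]

for prev_end, next_start in find_gaps(reported):
    print(f"gap: {prev_end} -> {next_start}")
```

Passing the full list returned by the reproduction snippet instead of `reported` should list every gap at once.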

Notes:

  • this issue may not occur with different dates
  • this issue does not occur when using frame=months instead of frame=month
  • this issue does not occur when not specifying exact=True

I am unsure of the difference between month and months, but I don't think this is the expected behavior.
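
One possible explanation, which is an assumption and not something confirmed in this thread or checked against Arrow's source, is that each span start snaps back to the original day of month (the 31st) whenever the month is long enough, while each span end is computed as "start plus one month, minus one microsecond" with the day clamped to the target month's length. The standard-library sketch below models only that hypothesis. The names `add_one_month` and `modeled_spans` are hypothetical, and time zones are ignored for brevity.

```python
import calendar
from datetime import datetime, timedelta


def add_one_month(dt):
    """Shift by one calendar month, clamping the day to the target month's length."""
    year, month = (dt.year + 1, 1) if dt.month == 12 else (dt.year, dt.month + 1)
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


def modeled_spans(start, count):
    """Hypothetical model: starts snap back to the original day, ends do not."""
    original_day = start.day
    spans = []
    current = start
    for _ in range(count):
        # End of the span: one (clamped) month after its own start, minus 1 microsecond.
        end = add_one_month(current) - timedelta(microseconds=1)
        spans.append((current, end))
        # Next start: one month later, restored to the original day when the month allows it.
        nxt = add_one_month(current)
        last_day = calendar.monthrange(nxt.year, nxt.month)[1]
        current = nxt.replace(day=min(original_day, last_day))
    return spans


for span_start, span_end in modeled_spans(datetime(2023, 1, 31), 4):
    print(span_start.isoformat(), span_end.isoformat())
```

Under that assumption, the span starting on 2023-02-28 ends on 2023-03-27 while the next one starts on 2023-03-31, which is the same gap as in the reported output.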

System Info

  • 🖥 OS name and version: Windows 10.0.19045 (reproduced on macOS)
  • 🐍 Python version: 3.9.6 (reproduced on 3.11)
  • 🏹 Arrow version: 1.3.0

Is anyone currently working on this? If not, I'd like to take a look into it. I was also able to reproduce the bug on Ubuntu 22.04.

@rkendra not at the moment, feel free to take it up