
TIMEZONE setting unable to handle two-digit UTC offset


These UTC offsets behave as expected:

import dateparser
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+8'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+08'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+0800'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+8:00'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+08:00'})
dateparser.parse('tomorrow', settings={'TIMEZONE': '+0800'})
dateparser.parse('tomorrow', settings={'TIMEZONE': '+08:00'})

However, if the UTC offset omits both the "UTC" prefix and the minutes, dateparser raises pytz's UnknownTimeZoneError. Example:

import dateparser
dateparser.parse('tomorrow', settings={'TIMEZONE': '+08'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/dateparser/", line 92, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/dateparser/", line 61, in parse
    data = parser.get_date_data(date_string, date_formats)
  File "/usr/lib/python3/dist-packages/dateparser/", line 451, in get_date_data
    parsed_date = _DateLocaleParser.parse(
  File "/usr/lib/python3/dist-packages/dateparser/", line 200, in parse
    return instance._parse()
  File "/usr/lib/python3/dist-packages/dateparser/", line 204, in _parse
    date_data = self._parsers[parser_name]()
  File "/usr/lib/python3/dist-packages/dateparser/", line 224, in _try_freshness_parser
    return freshness_date_parser.get_date_data(self._get_translated_date(), self._settings)
  File "/usr/lib/python3/dist-packages/dateparser/", line 156, in get_date_data
    date, period = self.parse(date_string, settings)
  File "/usr/lib/python3/dist-packages/dateparser/", line 91, in parse
    now = apply_timezone(utc_dt, settings.TIMEZONE)
  File "/usr/lib/python3/dist-packages/dateparser/utils/", line 119, in apply_timezone
    new_datetime = apply_tzdatabase_timezone(date_time, tz_string)
  File "/usr/lib/python3/dist-packages/dateparser/utils/", line 94, in apply_tzdatabase_timezone
    usr_timezone = timezone(pytz_string)
  File "/usr/lib/python3/dist-packages/pytz/", line 201, in timezone
    raise UnknownTimeZoneError(zone)
pytz.exceptions.UnknownTimeZoneError: '+08'
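
Until dateparser handles this form itself, one possible workaround is to rewrite the bare two-digit offset into one of the forms listed above as working before passing it to the TIMEZONE setting. The sketch below is only an illustration: `normalize_utc_offset` is a hypothetical helper, not part of dateparser, and it assumes that any other string should be passed through untouched.

```python
import re

# Matches exactly a sign followed by two digits, e.g. '+08' or '-05'.
_BARE_HOUR_OFFSET = re.compile(r'[+-]\d{2}')


def normalize_utc_offset(tz_string):
    """Expand a bare two-digit offset such as '+08' to '+08:00'.

    Hypothetical helper (not a dateparser API). Any string that is not
    a bare sign-plus-two-digits offset is returned unchanged.
    """
    if _BARE_HOUR_OFFSET.fullmatch(tz_string):
        return tz_string + ':00'
    return tz_string
```

It could then be used as `dateparser.parse('tomorrow', settings={'TIMEZONE': normalize_utc_offset('+08')})`, which passes '+08:00', one of the forms reported above to behave as expected.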

Dateparser should support two-digit UTC offsets because Python standard libraries sometimes return such offsets. For example:

$ TZ=:Asia/Singapore python3
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime

Please fix dateparser so that the TIMEZONE setting is able to handle two-digit UTC offsets such as '+08'.

I suggested @cwfoo open this bug report, but after investigating this issue a little further, it does seem like an odd timezone parameter...

I have a patch for undertime in here that tries to work around that issue:

I'm not sure what the right way to go is here. The best option would be for dateparser to accept actual tzinfo objects instead of requiring them to be passed as a string in the environment.
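
As long as TIMEZONE only takes a string, a caller that already holds a tzinfo object could derive the offset string from it rather than relying on the zone's name. This is a rough sketch and not the undertime patch mentioned above: `offset_string_from_tzinfo` is a hypothetical helper, and it assumes that every '+HH:MM' or '-HH:MM' string is accepted in the same way as the '+08:00' case reported in this issue.

```python
from datetime import datetime, timedelta, timezone


def offset_string_from_tzinfo(tz, when=None):
    """Format the UTC offset of a tzinfo object as '+HH:MM' or '-HH:MM'.

    Hypothetical helper (not a dateparser API). `when` is the aware
    datetime at which the offset is evaluated; it defaults to the
    current time. Sub-minute parts of the offset are dropped.
    """
    when = when or datetime.now(timezone.utc)
    offset = when.astimezone(tz).utcoffset()
    total_minutes = int(offset.total_seconds() / 60)
    sign = '+' if total_minutes >= 0 else '-'
    hours, minutes = divmod(abs(total_minutes), 60)
    return f'{sign}{hours:02d}:{minutes:02d}'


# A fixed-offset zone from the standard library, used for illustration.
singapore_like = timezone(timedelta(hours=8))
tz_setting = offset_string_from_tzinfo(singapore_like)
```

The resulting `tz_setting` could then be passed as `settings={'TIMEZONE': tz_setting}`. Evaluating the offset at a specific moment matters for zones with daylight saving time, where the offset depends on the date.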

> but after investigating this issue a little further, it does seem like an odd timezone parameter...

> The best would be for dateparser to accept actual tzinfo objects instead of having to pass them as a string in the environment.

Sounds like a valid enhancement.

Maybe you could edit the title and description of the issue to be about this enhancement. Or close this issue and open a new one about the enhancement.