Crash on a string containing a Unicode superscript digit in the middle part of the date
Similacrest opened this issue · 0 comments
datefinder==0.7.1
>>> import datefinder
>>> [d for d in datefinder.find_dates("2021-0²-12")]
Traceback (most recent call last):
  File "\lib\site-packages\dateutil\parser\_parser.py", line 655, in parse
    ret = self._build_naive(res, default)
  File "\lib\site-packages\dateutil\parser\_parser.py", line 1238, in _build_naive
    if cday > monthrange(cyear, cmonth)[1]:
  File "\lib\calendar.py", line 124, in monthrange
    raise IllegalMonthError(month)
calendar.IllegalMonthError: bad month number 0; must be 1-12

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "\lib\site-packages\datefinder\__init__.py", line 32, in find_dates
    as_dt = self.parse_date_string(date_string, captures)
  File "\lib\site-packages\datefinder\__init__.py", line 102, in parse_date_string
    as_dt = parser.parse(date_string, default=self.base_date)
  File "\lib\site-packages\dateutil\parser\_parser.py", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "\lib\site-packages\dateutil\parser\_parser.py", line 657, in parse
    six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
TypeError: unsupported operand type(s) for +: 'int' and 'str'
My understanding is that this is caused by either str.isdigit() or the regex '\d' accepting more than just 0-9. Indeed, the same crash also happens with the Kharosthi numerals mentioned in the str.isdigit() documentation.
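
To narrow down which of the two checks is the likely culprit, here is a small standalone sketch that uses only the standard library. It compares str.isdigit(), str.isdecimal(), the regex '\d' and the explicit class '[0-9]' on a few characters. The names classify and has_non_ascii_digit are hypothetical helpers written for this illustration; they are not part of datefinder or dateutil, and I have not confirmed where exactly either library performs its digit check.

```python
import re

SUPERSCRIPT_TWO = "\u00b2"      # '²', Unicode category No (other number)
KHAROSTHI_ONE = "\U00010a40"    # Kharosthi digit one, category No
ARABIC_INDIC_THREE = "\u0663"   # '٣', Unicode category Nd (decimal digit)


def classify(ch):
    """Report how each digit test treats a single character."""
    return {
        "isdigit": ch.isdigit(),
        "isdecimal": ch.isdecimal(),
        "regex_d": re.fullmatch(r"\d", ch) is not None,
        "ascii": re.fullmatch(r"[0-9]", ch) is not None,
    }


def has_non_ascii_digit(text):
    """Hypothetical pre-check: True if any character passes str.isdigit()
    but is not one of the ASCII digits 0-9."""
    return any(ch.isdigit() and not ("0" <= ch <= "9") for ch in text)


if __name__ == "__main__":
    for ch in (SUPERSCRIPT_TWO, KHAROSTHI_ONE, ARABIC_INDIC_THREE, "2"):
        print(ascii(ch), classify(ch))
    print(has_non_ascii_digit("2021-0²-12"))
```

For the superscript two and the Kharosthi digit, only str.isdigit() returns True. For str patterns, '\d' matches only characters in category Nd, so it rejects both of them, which suggests str.isdigit() rather than '\d' is the more likely source of this particular crash. Characters such as the Arabic-Indic digits pass all the checks except '[0-9]', so an ASCII-only test would be the strictest option. Until this is fixed, something like has_non_ascii_digit could be used by callers to skip or sanitize suspicious input before passing it to find_dates.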