Non-breaking space character (\u00A0) causes AssertionError
lacymorrow opened this issue · 2 comments
lacymorrow commented
Here is the problem string: Chatbot\u00a0\u2013
Traceback (most recent call last):
  File "<console>", line 5, in <module>
  File "/usr/local/lib/python3.6/site-packages/budou/parser.py", line 78, in parse
    chunks = self.segmenter.segment(source, language)
  File "/usr/local/lib/python3.6/site-packages/budou/tinysegmentersegmenter.py", line 94, in segment
    assert source[seek] == ' '
AssertionError
budou/budou/tinysegmentersegmenter.py
Line 94 in 87d9b81
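
A possible workaround until this is fixed, as a minimal sketch: the failing assertion compares against a literal ' ' (U+0020), and a non-breaking space (U+00A0) is whitespace but not equal to ' '. Replacing it before parsing may avoid the error. `normalize_spaces` is a hypothetical helper, not part of budou, and I am assuming the assertion is the only place that trips on this character.

```python
# Hypothetical helper, not part of budou: swap non-breaking spaces
# (U+00A0) for plain spaces (U+0020) before handing text to the parser.
NBSP = "\u00a0"


def normalize_spaces(source: str) -> str:
    """Return source with every U+00A0 replaced by a regular space."""
    return source.replace(NBSP, " ")


problem = "Chatbot\u00a0\u2013"

# The character at index 7 counts as whitespace but is not ' ',
# which is what a check like `source[seek] == ' '` would reject.
assert problem[7].isspace()
assert problem[7] != " "

cleaned = normalize_spaces(problem)
assert cleaned[7] == " "
```

The cleaned string can then be passed to the parser in place of the original. Note that this drops the non-breaking semantics of the space, which may matter if the output is rendered as HTML.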