KeyError: '<ï½\x9cendâ\x96\x81ofâ\x96\x81sentenceï½\x9c>' Tokenizer crash
SinanAkkoyun opened this issue · 0 comments
ERROR:waitress:Exception while serving /api/notepad_generate
Traceback (most recent call last):
  File "/home/ai/.mconda3/envs/exl2/lib/python3.11/site-packages/waitress/channel.py", line 428, in service
    task.service()
  File "/home/ai/.mconda3/envs/exl2/lib/python3.11/site-packages/waitress/task.py", line 168, in service
    self.execute()
  File "/home/ai/.mconda3/envs/exl2/lib/python3.11/site-packages/waitress/task.py", line 456, in execute
    for chunk in app_iter:
  File "/home/ai/.mconda3/envs/exl2/lib/python3.11/site-packages/werkzeug/wsgi.py", line 256, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "/home/ai/.mconda3/envs/exl2/lib/python3.11/site-packages/werkzeug/wrappers/response.py", line 32, in _iter_encoded
    for item in iterable:
  File "/home/ai/.mconda3/envs/exl2/lib/python3.11/site-packages/flask/helpers.py", line 115, in generator
    yield from gen
  File "/home/ai/ml/llm/inference/exl2/ex-ui/backend/notepads.py", line 324, in generate
    exclusive_sc.append(tokenizer.extended_piece_to_id[text])
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
KeyError: '<ï½\x9cendâ\x96\x81ofâ\x96\x81sentenceï½\x9c>'
Hi, this happens with DeepSeek models (Coder and LLM) when I click Generate
in the notepad. So far I haven't encountered it when clicking >> Token.
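
One observation that might help narrow this down: the key in the KeyError looks like the UTF-8 bytes of DeepSeek's `<｜end▁of▁sentence｜>` token read back as Latin-1. The snippet below is only a sketch of that round trip, not a fix for the UI. `lookup_piece` and the toy table are hypothetical names made up for illustration, and it assumes the real table is keyed by the correctly decoded string, which I have not verified.

```python
# The key exactly as it appears in the KeyError above.
garbled = '<ï½\x9cendâ\x96\x81ofâ\x96\x81sentenceï½\x9c>'

# Encoding as Latin-1 recovers the raw bytes; decoding them as UTF-8
# yields the fullwidth bar (U+FF5C) and lower block (U+2581) characters.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)


def lookup_piece(table, text):
    """Hypothetical helper: try the key as given, then its UTF-8 repair.

    `table` stands in for a piece-to-id mapping; returns None if neither
    form of the key is present.
    """
    if text in table:
        return table[text]
    try:
        fixed = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not a Latin-1 view of valid UTF-8, so there is nothing to repair.
        return None
    return table.get(fixed)


# Toy table with a placeholder id (0 is NOT the real token id).
toy_table = {repaired: 0}
print(lookup_piece(toy_table, garbled))
```

If the mismatch is in the other direction (the table holds the garbled form and the stop string is correct), the same idea applies with `encode("utf-8").decode("latin-1")`.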