curiosity-ai/catalyst

Tokenization issue


Describe the bug
When tokenizing a string while using EntityRecognition, the returned token values don't match the text that was sent.

To Reproduce
Tokenize value: "Postcode: 0000AA,huis nr. 223."

"223." is tokenized into two tokens, which are assigned wrong values:
  • "2" --> 11
  • "23." --> P.M..
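
A check along the following lines might make the mismatch easy to spot. It is only a sketch in Python, not Catalyst code: the `Token` class, its `begin`/`end` offsets, and `find_mismatches` are hypothetical names introduced for illustration, and the offsets are assumed from the position of "223." in the input. It compares each token's reported value with the slice of the input that the token covers.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token record; not the Catalyst token type."""
    begin: int   # inclusive start offset into the input (assumed)
    end: int     # exclusive end offset into the input (assumed)
    value: str   # value the tokenizer reported for this span


def find_mismatches(text, tokens):
    """Return (expected, reported) pairs where the value differs from the input slice."""
    return [
        (text[t.begin:t.end], t.value)
        for t in tokens
        if text[t.begin:t.end] != t.value
    ]


text = "Postcode: 0000AA,huis nr. 223."
start = text.index("223.")

# Values as described in this report; the trailing period after "P.M." in the
# report is assumed to be sentence punctuation and is left out here.
tokens = [
    Token(start, start + 1, "11"),
    Token(start + 1, start + 4, "P.M."),
]

mismatches = find_mismatches(text, tokens)
for expected, reported in mismatches:
    print(f"expected {expected!r}, got {reported!r}")
```

With a correct tokenizer the list of mismatches would be empty, since every token value should equal the input text at its offsets.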