Allow sentence-final lower-case i
kravlost opened this issue · 3 comments
kravlost commented
I've been trying to manually tweak the English JSON data to get the tokeniser to recognise ... i. (a word consisting of a single lower case i) as a valid end of sentence, without success. Any suggestions would be welcome.
neurosnap commented
Greetings! Happy to help.
When you tweak the english JSON file, do you also run make english before testing?
kravlost commented
Thanks for the suggestion! Ah, no, I'm just loading the JSON training data file as in the example in the README. (On Windows 11.)
kravlost commented
Couldn't get this to work on Windows. I've written a basic splitter instead which works well enough for my needs.
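The splitter itself was not posted in the thread. For readers after a similar workaround, here is a minimal sketch of what a basic splitter might look like in Go, using only the standard library. `SplitSentences` is a hypothetical name, and the rule it applies (end a sentence at `.`, `!` or `?` when followed by whitespace or the end of the text) is an assumption, not the approach kravlost actually used. Because it ignores capitalisation, a lone lower-case i before a full stop ends a sentence like any other word. The trade-off is that abbreviations such as "Dr." and spaced ellipses will also be split.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// SplitSentences is a hypothetical helper, not part of any library.
// It ends a sentence at '.', '!' or '?' when that rune is the last one
// in the text or is followed by whitespace. It has no abbreviation
// handling, so "Dr. Smith" would be split after "Dr.".
func SplitSentences(text string) []string {
	var out []string
	runes := []rune(text)
	start := 0
	for i, r := range runes {
		if r != '.' && r != '!' && r != '?' {
			continue
		}
		// Skip terminators that sit inside a token, e.g. "3.14" or "a.b".
		if i+1 < len(runes) && !unicode.IsSpace(runes[i+1]) {
			continue
		}
		if s := strings.TrimSpace(string(runes[start : i+1])); s != "" {
			out = append(out, s)
		}
		start = i + 1
	}
	// Keep any trailing text that has no terminator.
	if s := strings.TrimSpace(string(runes[start:])); s != "" {
		out = append(out, s)
	}
	return out
}

func main() {
	for _, s := range SplitSentences("It was just my sister and i. We left early.") {
		fmt.Println(s)
	}
}
```

A rule-based splitter like this needs no training data or build step, which sidesteps the Windows issue, but it is far less accurate than a trained tokeniser on text containing abbreviations.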