neurosnap/sentences

Allow sentence-final lower-case i

kravlost opened this issue · 3 comments

I've been trying to manually tweak the English JSON data to get the tokeniser to recognise ... i. (a word consisting of a single lower-case i, followed by a full stop) as a valid end of sentence, without success. Any suggestions would be welcome.

Greetings! Happy to help.

When you tweak the English JSON file, do you also run make english before testing?

Thanks for the suggestion! Ah, no, I'm just loading the JSON training data file as in the example in the README. (On Windows 11.)

Couldn't get this to work on Windows. I've written a basic splitter instead which works well enough for my needs.
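
For anyone landing here with the same problem, below is a minimal sketch of what such a basic splitter might look like in Go. It is not the poster's actual code and does not use this library: SplitSentences is a hypothetical name, and the rule it applies (break after ".", "!" or "?" when followed by whitespace and an upper-case letter, or at the end of the text) is an assumption chosen so that a trailing lower-case "i." ends a sentence.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// SplitSentences is a hypothetical helper, not part of neurosnap/sentences.
// It breaks after '.', '!' or '?' when the terminator is followed by
// whitespace and an upper-case letter, or by the end of the text.
func SplitSentences(text string) []string {
	runes := []rune(text)
	var out []string
	start := 0
	for i, r := range runes {
		if r != '.' && r != '!' && r != '?' {
			continue
		}
		// Look past any whitespace that follows the terminator.
		j := i + 1
		for j < len(runes) && unicode.IsSpace(runes[j]) {
			j++
		}
		atEnd := j == len(runes)
		beforeUpper := j > i+1 && j < len(runes) && unicode.IsUpper(runes[j])
		if atEnd || beforeUpper {
			if s := strings.TrimSpace(string(runes[start : i+1])); s != "" {
				out = append(out, s)
			}
			start = j
		}
	}
	// Keep any trailing text that has no terminator.
	if start < len(runes) {
		if s := strings.TrimSpace(string(runes[start:])); s != "" {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	for _, s := range SplitSentences("The answer is i. Then we stop.") {
		fmt.Println(s)
	}
}
```

Because the rule knows nothing about abbreviations, it will also split after something like "Dr." when a capitalised name follows, so it only suits text where that is acceptable.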