anoopkunchukuttan/indic_nlp_library

Wrong sentence tokenization of sentences with quotes

GokulNC opened this issue · 0 comments

Example:

>>> from indicnlp.tokenize import sentence_tokenize
>>> sentence_tokenize.sentence_split('He said "Will you bring me some water?". She said "Sure!", and went away.', lang='en')
['He said "Will you bring me some water? ". She said "Sure!',
'", and went away.']

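The split lands after the "!" inside the second quotation instead of after the period that closes the first sentence, so the second sentence is cut in two and its closing quote ends up at the start of the next chunk. A stray space is also inserted before the first closing quote.

The expected output is: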

['He said "Will you bring me some water?".',
'She said "Sure!", and went away.']
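
For illustration, here is a minimal sketch of one way to get the expected split. It is not the library's implementation; `split_sentences`, `TERMINATORS`, and `CLOSERS` are hypothetical names made up for this example. The idea is to treat a terminator plus any closing quotes, brackets, or further terminators as one sentence ending, and to split only when that run is followed by whitespace. It assumes English-style punctuation and does not handle abbreviations ("Dr. Smith") or a quoted question in the middle of a sentence ('He asked "Why?" and left.').

```python
import re

# Hypothetical constants for this sketch (not part of indic_nlp_library).
TERMINATORS = ".?!"
CLOSERS = "\"')]\u201d\u2019"  # straight and curly closing quotes, brackets

# A terminator, then any mix of terminators and closers, then whitespace.
# The whitespace is captured so it can be dropped from both sentences.
_BOUNDARY = re.compile(
    "[{t}][{t}{c}]*(\\s+)".format(t=re.escape(TERMINATORS), c=re.escape(CLOSERS))
)


def split_sentences(text):
    """Split text at terminator runs that are followed by whitespace."""
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        # Keep the punctuation run with the sentence it ends.
        sentences.append(text[start:match.start(1)])
        start = match.end(1)
    tail = text[start:]
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    sample = 'He said "Will you bring me some water?". She said "Sure!", and went away.'
    for sentence in split_sentences(sample):
        print(sentence)
```

Because `!",` is followed by a comma rather than whitespace, no split happens inside the second sentence, while `?".` followed by a space is treated as a single sentence ending.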