nipunsadvilkar/pySBD

Combination of single quotes prevents sbd

guydepauw opened this issue · 0 comments

Describe the bug
Text containing a particular combination of single quotes (a lone ' early in the text and doubled '' later on) is not segmented: the whole text comes back as a single segment.

To Reproduce
Steps to reproduce the behavior:
Input text - Come work for us in 'S-Hertogenbosch. To ensure products meet specifications and standards, you will perform in-process inspection. The goal will be to make sure that production procedures will be carried on smoothly to maximize efficiency and profits. where will you work. COMPANY is a global leader in high-end server technology and innovation of IT products. There are also options to work abroad! apply.Are you interested in the position of production operator? Then apply directly via the ''apply'' button below.
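
A minimal script that may help reproduce this. The issue does not state the segmenter settings, so `language="en"` and `clean=False` are assumptions (the trailing spaces in the expected output below suggest `clean=False`). `TEXT` and `main` are just names chosen for this sketch.

```python
# Input text from this issue, split across lines only for readability.
TEXT = (
    "Come work for us in 'S-Hertogenbosch. "
    "To ensure products meet specifications and standards, "
    "you will perform in-process inspection. "
    "The goal will be to make sure that production procedures will be "
    "carried on smoothly to maximize efficiency and profits. "
    "where will you work. "
    "COMPANY is a global leader in high-end server technology and "
    "innovation of IT products. "
    "There are also options to work abroad! "
    "apply.Are you interested in the position of production operator? "
    "Then apply directly via the ''apply'' button below."
)


def main():
    try:
        import pysbd
    except ImportError:
        print("pysbd is not installed; run `pip install pysbd` first")
        return
    # Assumption: English segmenter with clean=False (not stated in the issue).
    segmenter = pysbd.Segmenter(language="en", clean=False)
    sentences = segmenter.segment(TEXT)
    # Print the segment count, then each segment with its exact whitespace.
    print(len(sentences), "segment(s)")
    for sentence in sentences:
        print(repr(sentence))


if __name__ == "__main__":
    main()
```

On the affected version this is reported to print a single segment, where seven are expected. Results may differ on other pySBD versions.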

Expected behavior
Expected output - list of expected sentences

["Come work for us in 'S-Hertogenbosch. ", 'To ensure products meet specifications and standards, you will perform in-process inspection. ', 'The goal will be to make sure that production procedures will be carried on smoothly to maximize efficiency and profits. ', 'where will you work. ', 'COMPANY is a global leader in high-end server technology and innovation of IT products. ', 'There are also options to work abroad! apply.Are you interested in the position of production operator? ', "Then apply directly via the ''apply'' button below."]

Actual output:

["Come work for us in 'S-Hertogenbosch. To ensure products meet specifications and standards, you will perform in-process inspection. The goal will be to make sure that production procedures will be carried on smoothly to maximize efficiency and profits. where will you work. COMPANY is a global leader in high-end server technology and innovation of IT products. There are also options to work abroad! apply.Are you interested in the position of production operator? Then apply directly via the ''apply'' button below."]

Additional context
Removing the first single quote (the one before S-Hertogenbosch) or replacing each doubled single quote ('') with a double quote (") resolves the issue.
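
To check the two workarounds side by side, something like the sketch below could be used. The helper names `workaround_variants` and `segment_counts` are hypothetical, and the `pysbd.Segmenter(language="en", clean=False)` settings are an assumption, since the issue does not state them. It also assumes "the 2 single quotes" means each `''` pair around `apply`.

```python
# Input text from this issue, split across lines only for readability.
TEXT = (
    "Come work for us in 'S-Hertogenbosch. "
    "To ensure products meet specifications and standards, "
    "you will perform in-process inspection. "
    "The goal will be to make sure that production procedures will be "
    "carried on smoothly to maximize efficiency and profits. "
    "where will you work. "
    "COMPANY is a global leader in high-end server technology and "
    "innovation of IT products. "
    "There are also options to work abroad! "
    "apply.Are you interested in the position of production operator? "
    "Then apply directly via the ''apply'' button below."
)


def workaround_variants(text):
    """Return the original text plus the two edits described above."""
    return {
        "original": text,
        # Drop only the first single quote in the text
        # (in the issue text, the one before "S-Hertogenbosch").
        "first single quote removed": text.replace("'", "", 1),
        # Turn each doubled single quote ('') into one double quote (").
        "doubled quotes replaced": text.replace("''", '"'),
    }


def segment_counts(segment, text):
    """Map each variant name to the number of segments `segment` returns."""
    variants = workaround_variants(text)
    return {name: len(segment(variant)) for name, variant in variants.items()}


if __name__ == "__main__":
    try:
        import pysbd
    except ImportError:
        print("pysbd is not installed; run `pip install pysbd` first")
    else:
        # Assumption: English segmenter with clean=False.
        segmenter = pysbd.Segmenter(language="en", clean=False)
        for name, count in segment_counts(segmenter.segment, TEXT).items():
            print(f"{name}: {count} segment(s)")
```

`segment_counts` takes the segmenting function as an argument, so the same comparison can be run against other segmenters or pySBD settings.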