XXXX et al. [2004] error
Closed this issue · 2 comments
GabrielLin commented
Describe the bug
Incorrect sentence segmentation.
To Reproduce
import pysbd
text = "Yan et al. [2004] analysed SSH variations in northwest Europe and suggested that SSH changes are related to changes in heat content and heat fluxes."
seg = pysbd.Segmenter(language="en", clean=False)
print(seg.segment(text))
The text is a single sentence and should not be segmented.
nipunsadvilkar commented
Hey @GabrielLin, this would be considered an edge case and should be handled on the consumer end. If pysbd breaks outright or returns a corrupted sentence, then it's an issue to be resolved on the developer end.
For cases like the one above, you can consume pysbd's output and write your own rules on top of it. I hope this helps!
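A rule layered on top of pysbd's output could look roughly like the sketch below. It assumes the unwanted split falls between "et al." and the bracketed year, which this thread does not confirm. The function name `merge_et_al_splits` and both regular expressions are hypothetical and are not part of pysbd.

```python
import re

# Assumption: a segment ending in "et al." is an unfinished citation when the
# next segment opens with a bracketed year such as "[2004]" or "(2004a)".
_ENDS_WITH_ET_AL = re.compile(r"\bet al\.\s*$")
_STARTS_WITH_YEAR = re.compile(r"^\s*[\[(]\d{4}[a-z]?[\])]")


def _join(left, right):
    """Concatenate two segments, adding a space only if none is present."""
    if left[-1:].isspace() or right[:1].isspace():
        return left + right
    return left + " " + right


def merge_et_al_splits(segments):
    """Rejoin segments that were split between "et al." and a bracketed year.

    `segments` is a list of strings, e.g. the list returned by a sentence
    segmenter. Segments that do not match the rule are passed through as-is.
    """
    merged = []
    for segment in segments:
        if (
            merged
            and _ENDS_WITH_ET_AL.search(merged[-1])
            and _STARTS_WITH_YEAR.match(segment)
        ):
            # Glue the citation year back onto the previous segment.
            merged[-1] = _join(merged[-1], segment)
        else:
            merged.append(segment)
    return merged
```

With the reproduction snippet from the report, this would be applied as `merge_et_al_splits(seg.segment(text))`. The rule only looks at the boundary between two adjacent segments, so other abbreviations or citation styles would need their own patterns.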
GabrielLin commented
OK, thanks. I hope a feature can be added to pySBD so that it can handle custom rules like this.