Spelling correction is mostly not working
stas00 opened this issue · 0 comments
stas00 commented
I came to this project for spelling correction in Twitter text, but it doesn't work most of the time.
- Spell correction seems to work only when `annotate` is set as in the example. Now take the same example and set `annotate={}`, and spell correction is gone:
i saw the new john doe movie and it suuuuucks ! ! ! waisted <money> . . . bad movies <annoyed>
If I restore `annotate={"hashtag", "...}`, then it corrects `suuuuucks` to `sucks`.
I'm not sure what the connection between annotations and spell correction is.
- Spelling correction doesn't work in general. Again, going back to your pipeline example, change the first input sentence to inject some spelling errors:
CANT WAIT for the neww seaason of #TwinPeaks
Run it, and you get:
cant wait for the neww seaason of twin peaks
i.e. no spell correction.
The `spell_correct_elong` option doesn't seem to make a difference.
Yet, if I run:
from ekphrasis.classes.spellcorrect import SpellCorrector
sp = SpellCorrector(corpus="english")
print([sp.correct(x) for x in "neww seaason".split()])
It corrects: ['new', 'season']
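Since the standalone corrector handles these words, one possible stopgap is to run a corrector over the tokens the pipeline returns. The following is only a minimal sketch under assumptions: `correct_tokens` is a hypothetical helper (not part of ekphrasis), and it assumes annotation tags such as `<money>` and punctuation should be passed through untouched. A stand-in dictionary corrector is used so the sketch runs without ekphrasis installed.

```python
import re
from typing import Callable, Iterable, List

# Tokens made only of ASCII letters are candidates for correction;
# annotation tags such as <money>, punctuation, and numbers are left alone.
_WORD = re.compile(r"^[A-Za-z]+$")


def correct_tokens(tokens: Iterable[str], correct: Callable[[str], str]) -> List[str]:
    """Hypothetical helper: apply `correct` to plain-word tokens only."""
    return [correct(tok) if _WORD.match(tok) else tok for tok in tokens]


if __name__ == "__main__":
    # Stand-in corrector for demonstration only. With ekphrasis installed,
    # pass SpellCorrector(corpus="english").correct instead.
    demo_fixes = {"neww": "new", "seaason": "season"}
    tokens = "cant wait for the neww seaason of twin peaks".split()
    print(" ".join(correct_tokens(tokens, lambda t: demo_fixes.get(t, t))))
```

With the real `SpellCorrector`, other tokens may also be changed, so the output should be checked on a sample before relying on it.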