cbaziotis/ekphrasis

spelling correction mostly is not working

stas00 opened this issue · 0 comments

I came to this project for spelling correction of Twitter text, but it doesn't work most of the time.

  1. Spell correction seems to work only when annotate is set as in the example. Take
    the same example, set annotate={}, and the spell correction is gone:
i saw the new john doe movie and it suuuuucks ! ! ! waisted <money> . . . bad movies <annoyed>

If I restore annotate={"hashtag", "...}, then it corrects suuuuucks to sucks.
I'm not sure what the connection between annotations and spell correction is.

  2. Spelling correction doesn't work in general. Going back to your pipeline example, change the first input sentence to inject some spelling errors: CANT WAIT for the neww seaason of #TwinPeaks. Run it and you get:
    cant wait for the neww seaason of twin peaks
    i.e. no spell correction at all.
    Setting spell_correct_elong doesn't seem to make a difference.

Yet, if I run:

from ekphrasis.classes.spellcorrect import SpellCorrector
# Call the corrector directly on each word, bypassing the pipeline
sp = SpellCorrector(corpus="english")
print([sp.correct(x) for x in "neww seaason".split()])

It corrects both words: ['new', 'season']
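
Since the standalone corrector works, one possible workaround is to run it over the pipeline's output tokens myself. Below is a minimal sketch of that idea, not how ekphrasis does it internally. The helper `correct_tokens`, the `TAG` pattern, and the `toy_correct` stand-in are all hypothetical names made up for illustration; the stand-in only exists so the snippet runs without ekphrasis installed.

```python
import re
from typing import Callable, Iterable, List

# Tokens such as <money> or <annoyed> are annotation tags, not words,
# so they should never be passed to a spell corrector.
TAG = re.compile(r"^<[^<>\s]+>$")


def correct_tokens(tokens: Iterable[str], correct: Callable[[str], str]) -> List[str]:
    """Apply `correct` to purely alphabetic tokens; keep tags and punctuation as-is."""
    out = []
    for tok in tokens:
        if TAG.match(tok) or not tok.isalpha():
            out.append(tok)
        else:
            out.append(correct(tok))
    return out


# Hypothetical stand-in for a real corrector; it only knows two words.
_FIXES = {"neww": "new", "seaason": "season"}


def toy_correct(word: str) -> str:
    return _FIXES.get(word, word)


tokens = "cant wait for the neww seaason of twin peaks".split()
print(" ".join(correct_tokens(tokens, toy_correct)))
# prints: cant wait for the new season of twin peaks
```

With ekphrasis installed, passing `sp.correct` from the snippet above in place of `toy_correct` should give the same kind of result, assuming the pipeline's output has been split into tokens first.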