OpenTaal/opentaal-hunspell

Wrongly reports a misspelling when a word has a / character before or after it.

Closed · 1 comment

I've run into the following problem. When a word starts with or ends with a / or + character, hunspell reports this as an error.
Example:
echo 'Een boer en een jongen. /jongen/ /boer/' | hunspell -l -d nl_NL /dev/stdin

With a dictionary such as en_US, this problem doesn't occur. Any idea why this is happening?

EDIT: I see these characters are defined in WORDCHARS '’0123456789ij.-\/+₂²€@ while in en_US this is just WORDCHARS 0123456789’. When using hunspell-nl in org-mode in Emacs, this definition causes wrongly reported misspellings. Luckily the syntax table can be edited to work around this problem.

Yes, that is the reason: these are valid characters in Dutch words that we support. You can fix this by preprocessing what is sent to the spelling checker. Hunspell and ispell support some text formats; Nuspell supports only clean text. There are so many text formats that it is very hard to support them all, and doing so is out of scope for most spelling checkers. What format are you using here? Perhaps Emacs can handle it and extract clean text from it.
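
To illustrate the preprocessing suggestion, here is a minimal sketch in Python. It is not part of hunspell or of this dictionary; the function name `strip_edge_markers` is made up for this example, and it assumes that a / or + at the very start or end of a whitespace-separated token is markup (as in org-mode emphasis) rather than part of the word. A / or + inside a word is left alone, since those are valid word characters in nl_NL.

```python
import re

# Matches runs of "/" or "+" that sit at the start of a token (nothing but
# whitespace or start-of-text before them) or at the end of a token
# (nothing but whitespace or end-of-text after them).
_EDGE_MARKERS = re.compile(r"(?<!\S)[/+]+|[/+]+(?!\S)")


def strip_edge_markers(text: str) -> str:
    """Remove leading/trailing "/" and "+" from whitespace-separated tokens.

    Assumption (hypothetical, for illustration): such edge characters are
    markup, not part of the word. Characters inside a token are untouched.
    """
    return _EDGE_MARKERS.sub("", text)
```

With this, the example sentence from the report becomes `Een boer en een jongen. jongen boer`, which can then be piped to `hunspell -l -d nl_NL`. Note that a marker followed directly by punctuation (for example `/jongen/.`) is not handled by this sketch; a real org-mode file is probably better served by letting Emacs extract the clean text.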