rrthomas/enchant

Support spell-checking with multiple dictionaries

Is it possible to do this with enchant-2?
I've tried, with no success.

  • I know hunspell can do this:
hunspell -d en_US,en_US_med medical.txt
hunspell -d es,en_US spanglish.txt
  • aspell should be able to do that as well, but I searched for examples and experimented with parameters (like "--extra-dicts") with no success.

  • From here and my own tests, I found that hunspell is not so good with suggestions (I tried English and Spanish words). Because of this, I'm looking for other options, like using aspell or enchant-2.

  • Nuspell has some plans to do this in the future.

There's no way to do this with the CLI program, no. I would happily accept a patch that added that ability. The obvious complexity would be what to do about the session word list: ideally one wants to assign each word to one language. With the front end that is not easy. There are various simple solutions I can think of:

  • Assign exclusions/additions to the first language
  • Assign them to all languages (probably the best solution on a per-session basis)
  • Ignore exclusions/inclusions (but then it behaves differently from when only one language is used)
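
To make the second option concrete, here is a minimal sketch of a session word list shared by all languages. It is not enchant's code: `SharedSession` is a hypothetical name, and each language is represented only by an assumed callable that takes a word and returns whether that language accepts it.

```python
class SharedSession:
    """Session additions/exclusions that apply to every language (sketch)."""

    def __init__(self, checkers):
        # Assumption: one callable (word -> bool) per language.
        self.checkers = checkers
        self.added = set()
        self.excluded = set()

    def add(self, word):
        # Accept the word for the rest of the session, in all languages.
        self.excluded.discard(word)
        self.added.add(word)

    def exclude(self, word):
        # Reject the word for the rest of the session, in all languages.
        self.added.discard(word)
        self.excluded.add(word)

    def check(self, word):
        # The session lists take priority over every dictionary.
        if word in self.excluded:
            return False
        if word in self.added:
            return True
        return any(check(word) for check in self.checkers)
```

Because the session lists are consulted before any dictionary, the front end never has to decide which language a word belongs to. That is exactly why this approach does not carry over to the PWL.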

More serious is what happens to exclusions/inclusions for one's PWL. For this you definitely need to know which language to add to/exclude from. I think the best thing would be to extend the protocol to allow the language of inclusions/exclusions to be signalled.

However, merely to support extra languages in non-interactive mode (-l), not even this matters. So it would be possible in the first instance to have a patch that only allows multiple languages with -l.
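
As a rough sketch of what a multi-language list mode could do: report each word that none of the requested languages accepts. Nothing here is taken from enchant's source. `list_misspelled` is a hypothetical name, the word sets stand in for real dictionaries, the tokenisation is deliberately naive, and reporting each word only once is an arbitrary choice.

```python
import re

def list_misspelled(text, checkers):
    """Return words rejected by every checker, in order of first appearance."""
    seen = set()
    misspelled = []
    # Naive tokeniser: runs of letters, optionally joined by apostrophes.
    for word in re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text):
        if word in seen:
            continue
        seen.add(word)
        # A word is fine as soon as any one language accepts it.
        if not any(check(word) for check in checkers):
            misspelled.append(word)
    return misspelled

# Hypothetical word lists standing in for real dictionaries.
english = {"my", "house", "is", "your"}
spanish = {"mi", "casa", "es", "su"}
checkers = [lambda w: w.lower() in english, lambda w: w.lower() in spanish]
print(list_misspelled("Mi casa es your hause", checkers))  # prints ['hause']
```

With real dictionaries, each checker could be the `check` method of a dictionary object instead of a lambda over a set.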

As a separate matter, applications using libenchant can of course set up multiple languages in various ways.

  • I don't know how hunspell or aspell do that; maybe by looking at the source code?
  • I think enchant-2 should be able to pass parameters to both hunspell (like: hunspell -d es,en_US spanglish.txt) and aspell.
  • I'm not a programmer who could do a PR; I just created this account to give feedback on software that I'm interested in.
  • From my point of view, it should be easy to:
    1. run the spell check with the 1st language, then
    2. spell-check the unknown words with the 2nd language, then
    3. for words that are still unknown, search for suggestions in both languages
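
The three steps above can be sketched as follows. This is only an illustration under stated assumptions, not how enchant, hunspell or aspell work internally. `SetDict` is a hypothetical stand-in for a real dictionary, with toy suggestions from the standard `difflib` module. Any object with `check(word)` and `suggest(word)` methods could be used in its place.

```python
import difflib

class SetDict:
    """Hypothetical stand-in for a real dictionary (illustration only)."""

    def __init__(self, words):
        self.words = set(words)

    def check(self, word):
        return word.lower() in self.words

    def suggest(self, word):
        # Toy suggestions: known words that look similar to the input.
        return difflib.get_close_matches(word.lower(), self.words, n=3)

def check_words(words, dicts):
    """Return {unknown_word: [suggestions]} after trying each dictionary in order."""
    unknown = list(words)
    # Steps 1 and 2: each dictionary only sees words the previous ones rejected.
    for d in dicts:
        unknown = [w for w in unknown if not d.check(w)]
    # Step 3: words still unknown get suggestions from every dictionary.
    result = {}
    for w in unknown:
        merged = []
        for d in dicts:
            for s in d.suggest(w):
                if s not in merged:
                    merged.append(s)
        result[w] = merged
    return result

en = SetDict(["the", "house", "is", "red"])
es = SetDict(["la", "casa", "es", "roja"])
print(check_words(["the", "casa", "is", "rojo", "hous"], [en, es]))
```

Here "rojo" and "hous" are the only words left after both passes, and their suggestions are drawn from both word lists.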

Thanks for the clarification @Disonantemus. Can I just check, are you simply running enchant-2 from the command line, non-interactively? (As far as I know, that's quite an unusual use case.)

Also see the GSOC2013 work in #34 for doing this "properly" in the library.

My use case is vis, a vi-like text editor based on Plan 9's structural regular expressions (less bloated than [n]vim, with more features than vi), together with vis-spellcheck, a plugin that provides syntax-aware spellchecking (with hunspell, aspell and enchant). All from the command line.

English is not my first/native language, and I think a lot of non-native English users need this multi-language feature.

Thanks again! I'll look into how hard it would be to merge the gsoc2013dict branch. (I am guessing it is not trivial, because then I would have done it years ago when I took over the project, so don't be too optimistic!)

See #34 for progress.

Closing this issue in favour of #34.