optimaize/language-detector

No way to change default n-gram size from 3 to something else

The n-gram size seems to be fixed at 3. How can it be changed to a user-specified value?

The class NgramExtractors defines its standard extractor like this:

    private static final NgramExtractor STANDARD = NgramExtractor
            .gramLengths(1, 2, 3)
            .filter(StandardNgramFilter.getInstance())
            .textPadding(' ');

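so you should be able to build your own extractor the same way, passing different values to gramLengths.
Note, though, that I think the bundled models only contain 1-grams, 2-grams, and 3-grams, so longer n-grams would probably have nothing to match against unless you also build your own profiles.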
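
To make it concrete what a list of gram lengths means, here is a minimal standalone sketch in plain Java. It is not the library's code: the class NgramSketch and its extract method are hypothetical names made up for illustration, it applies no filter, and the way it pads the text (one padding character on each side) is an assumption rather than a description of what textPadding does internally.

```java
import java.util.ArrayList;
import java.util.List;

public class NgramSketch {
    // Hypothetical helper, not part of language-detector.
    // Pads the text with one padding character on each side, then collects
    // every substring of each requested length, in order of length.
    static List<String> extract(String text, char padding, int... gramLengths) {
        String padded = padding + text + padding;
        List<String> grams = new ArrayList<>();
        for (int n : gramLengths) {
            // Slide a window of width n across the padded text.
            for (int i = 0; i + n <= padded.length(); i++) {
                grams.add(padded.substring(i, i + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Same lengths as the STANDARD extractor quoted above.
        System.out.println(extract("hello", ' ', 1, 2, 3));
        // User-specified lengths, e.g. only 4-grams.
        System.out.println(extract("hello", ' ', 4));
    }
}
```

With the real library, the equivalent change would be to pass other values to gramLengths when building your own NgramExtractor, keeping in mind that the n-grams it produces only help if the loaded profiles contain n-grams of the same lengths.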