Lombiq/DotNest-Support

Lucene analyzer choice

By fbardet, Tuesday, November 3, 2015 12:58:35 PM
Hello Zoltan,

This is my last request.

I think the analyzer used by Lucene in DotNest sites is the StandardAnalyzer. This analyzer does not handle accented characters, such as French accents: a search for "eleve" will not match "élève". In my own Orchard sites, I use my own analyzer, as follows:

public class MyAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
    {
        StandardTokenizer tokenizer = new StandardTokenizer(LuceneIndexProvider.LuceneVersion, reader);
        tokenizer.MaxTokenLength = 255;
        TokenStream stream = new StandardFilter(tokenizer);
        stream = new LowerCaseFilter(stream);
        return new ASCIIFoldingFilter(stream);
    }
}
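To illustrate what the lowercasing and ASCII-folding steps in this chain do to French text, here is a minimal standalone sketch that uses only the .NET base class library. The `AccentFolding` class and its `Fold` method are hypothetical names invented for this illustration; they are not part of Orchard or Lucene.Net, and the approach (Unicode decomposition, then dropping combining marks) only approximates what `ASCIIFoldingFilter` does.

```csharp
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only; not part of Orchard or Lucene.Net.
public static class AccentFolding
{
    // Lowercases the text and strips combining accent marks,
    // so that "Frédéric" and "frederic" produce the same string.
    public static string Fold(string text)
    {
        // Decompose each accented letter into base letter + combining mark,
        // e.g. "é" becomes "e" followed by U+0301.
        string decomposed = text.Normalize(NormalizationForm.FormD);

        var folded = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Skip the combining marks; keep everything else, lowercased.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                folded.Append(char.ToLowerInvariant(c));
            }
        }

        // Recompose whatever is left into the usual composed form.
        return folded.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With this sketch, `AccentFolding.Fold("Frédéric")` and `AccentFolding.Fold("frederic")` return the same string, which is the effect the analyzer above has on both indexed text and search terms. Unlike `ASCIIFoldingFilter`, the sketch leaves characters that have no Unicode decomposition, such as "œ", unchanged.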

Would it be possible, in the future, to choose between the StandardAnalyzer and an ASCIILowerCaseAnalyzer like MyAnalyzer above when enabling the Lucene module?

Best regards.

Frédéric

By Zoltán Lehóczky, Tuesday, November 3, 2015 3:20:06 PM
Hi Frédéric,

yes, DotNest sites use Orchard's built-in analyzer. Can you confirm that what you see is this issue: OrchardCMS/Orchard#3887? If so, we can make an effort to fix this in Orchard itself.

BTW feel free to come here with issue/feature requests any time :-). Social Meta Tags is in progress too.

Thanks,

Zoltán

By fbardet, Friday, November 6, 2015 2:51:52 PM
Hello Zoltán,

Yes, it's the same problem as in that issue.

Maybe the best solution is to add a new module to DotNest sites, named "LuceneASCIILowerCase" for instance. The administrator could then choose which Lucene module to use?

Best regards. Frédéric

By Zoltán Lehóczky, Friday, November 6, 2015 11:51:05 PM
Hi Frédéric,

probably the best long-term solution would be to have the configuration in Orchard. But I have a hunch that you need this for French; in this case this module would be a suitable solution, right? https://github.com/Codinlab/Lucene.FrenchAnalyser

Thanks,

Zoltán

By fbardet, Thursday, November 12, 2015 8:35:26 AM
Hello Zoltán,

This solution (https://github.com/Codinlab/Lucene.FrenchAnalyser) seems ok.

Best regards. Frédéric

By Zoltán Lehóczky, Thursday, November 12, 2015 11:16:25 PM
Hi Frédéric,

thank you, great. We've started working on the Orchard fix; we'll see whether it can be solved in a good generic way, and if not, we'll add the module to the DotNest selection.

Thanks,

Zoltán