standard_edge_ngram_analyzer not working correctly
Denis2310 opened this issue · 4 comments
Preconditions
Magento Version : 2.4.6-p4
ElasticSuite Version : 2.11.6
Environment : Developer
Third party modules :
Steps to reproduce
- Go to magento admin
- Create product with name "rotational"
- Set product name attribute to use standard_edge_ngram_analyzer
- Set elasticsuite -> search relevance -> exact match configuration -> Use default analyzer in exact matching filter query -> Yes
- Set elasticsuite -> spellchecking configuration -> term vectors configuration -> Use edge ngram analyzer in term vectors -> Yes
Expected result
- Magento admin -> elasticsuite -> system -> analysis -> select corresponding index -> select standard_edge_ngram_analyzer
- Type rotational keyword
- Tokens: rot, rota, rotat, rotati, rotatio, rotation, rotationa, rotational are shown
Actual result
- Magento admin -> elasticsuite -> system -> analysis -> select corresponding index -> select standard_edge_ngram_analyzer
- Type rotational keyword
- Tokens: rot, rota, rotat, rotat are shown
- If I test with a different keyword, "rotationale", the tokens are: rot, rota, rotat, rotati, rotatio, rotation, rotationa, rotational
- If I test with a different keyword, "rotationals", the tokens are: rot, rota, rotat
- If I test with a different keyword, "rotationa", the tokens are: rot, rota, rotat, rotati, rotatio, rotation, rotationa
Why does the result depend on the keyword? Tokens are not always generated from 3 characters up to the whole string.
min_gram = 3
max_gram = 20
Hi,
Most probably because the word is stemmed before being sent to the edge_n_gram filter: https://github.com/Smile-SA/elasticsuite/blob/2.11.x/src/module-elasticsuite-core/etc/elasticsuite_analysis.xml#L293
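To illustrate the point, here is a minimal Python sketch of what an edge n-gram step does with min_gram = 3 and max_gram = 20. The `edge_ngrams` helper is hypothetical (it is not ElasticSuite or Elasticsearch code), and the stem "rotat" is an assumption inferred from the tokens reported above, not output taken from a real stemmer.

```python
from typing import List


def edge_ngrams(token: str, min_gram: int = 3, max_gram: int = 20) -> List[str]:
    """Return the prefixes of `token` whose length is between min_gram and max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# Assumed stem of "rotational" (inferred from the issue, not computed here):
# the edge n-gram step only ever sees the stemmed token.
print(edge_ngrams("rotat"))       # ['rot', 'rota', 'rotat']

# Without stemming, the full word reaches the edge n-gram step.
print(edge_ngrams("rotational"))  # prefixes from 'rot' up to 'rotational'
```

If the stemmer reduces the word before the n-grams are cut, the longest token can never be longer than the stem, which would explain why the result depends on the keyword.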
Hi @romainruaud, what does "stemmed" mean? Should I remove these from the filter list, or create a new custom filter?
<filter ref="stemmer_override" />
<filter ref="stemmer" />
https://www.elastic.co/guide/en/elasticsearch/reference/current/stemming.html
You could try creating another analyzer, equivalent to standard_edge_ngram but without the stemmer filter.
Then check on the Analysis screen what output your words produce with this analyzer.
regards
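As a complement to the Analysis screen, the same comparison could be tried against the Elasticsearch `_analyze` API with an inline filter chain. The sketch below only builds the two request bodies. The `analyze_body` helper is hypothetical, and the chain (standard tokenizer, lowercase, optional English stemmer, edge_ngram) is a simplified assumption, not the exact filter list of standard_edge_ngram in elasticsuite_analysis.xml.

```python
import json


def analyze_body(text, with_stemmer, min_gram=3, max_gram=20):
    """Build a request body for POST /_analyze using inline token filters."""
    filters = ["lowercase"]
    if with_stemmer:
        # Assumed stemmer settings; the real analyzer may use another language.
        filters.append({"type": "stemmer", "language": "english"})
    filters.append({"type": "edge_ngram", "min_gram": min_gram, "max_gram": max_gram})
    return {"tokenizer": "standard", "filter": filters, "text": text}


# Body that mimics the current behaviour (stemmer before edge_ngram).
print(json.dumps(analyze_body("rotational", with_stemmer=True), indent=2))

# Body that mimics the suggested analyzer (no stemmer).
print(json.dumps(analyze_body("rotational", with_stemmer=False), indent=2))
```

Sending each body to `_analyze` on your cluster and comparing the returned tokens should show whether removing the stemmer gives the full range of prefixes.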
Works fine, thanks @romainruaud!