Weird results when searching on title with elasticsearch
The search on title (home page and topoguide search) returns odd results. For instance, an exact word match is often not ranked first. This is probably due to an incorrect configuration of the Elasticsearch query in the function get_text_query_on_title() (repository: \v6_api\c2corg_api\search). The results of the search on ngrams and draw are boosted at the expense of the word match. I don't see why the search is done on ngrams and draw at all.
One solution to test is to remove the fields list from the query and rely on a simple search on words. When the fuzziness parameter is set to 'auto', the search also returns documents that contain terms similar to the search term (cf. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html).
I tried to install the API environment on my laptop (Windows 10) but it failed, so the change below is untested. One could try the following modification of the function get_text_query_on_title(). If it works, a proper fix should also remove the search_lang variable (the impact on the UI code needs to be checked).
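from elasticsearch_dsl.query import MultiMatch


def get_text_query_on_title(search_term, search_lang=None):
    # search_lang is kept for now so existing callers do not break.
    # No fields are given, so Elasticsearch falls back to its default field(s).
    return MultiMatch(
        query=search_term,
        fuzziness='auto',
        operator='and'
    )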
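To make the proposal easier to review without a working API environment, here is a minimal sketch of the query body that the function above would roughly serialize to, written as a plain dict. The helper name build_title_query and the field names in the usage note are hypothetical and are not taken from the c2corg_api code.

```python
import json


def build_title_query(search_term, fields=None):
    """Build a multi_match request body as a plain dict (hypothetical helper)."""
    multi_match = {
        "query": search_term,
        "fuzziness": "auto",  # also match terms within a small edit distance
        "operator": "and",    # every search word must match
    }
    if fields:
        # Optional: restrict the search to explicit fields, with boosts.
        multi_match["fields"] = list(fields)
    return {"query": {"multi_match": multi_match}}


if __name__ == "__main__":
    print(json.dumps(build_title_query("mont blanc"), indent=2))
```

If dropping the fields entirely turns out to be too blunt, a middle ground would be to keep them but boost the plain word field above the ngram one, for example build_title_query("mont blanc", ["title^3", "title.ngram"]). These field names are assumptions for illustration only.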
Functional details: https://forum.camptocamp.org/t/topoguide-recherche-par-chaine-de-caracteres-dans-le-titre-plus-assez-selective/319862/69
A few leads:
- run explain (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-explain.html) on the problematic queries
- Check whether forcing a search language (and therefore a stop-word list) helps or not. The API does not seem to do this today.
- Test another similarity measure (other than BM25) for searches limited to the title
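As a starting point for the first and last leads, here is a rough sketch of the two request bodies involved, again as plain dicts. It assumes a recent Elasticsearch version (typeless Explain API path and mappings). The index name, document id and the "title" field are placeholders, and "boolean" is only one example of a built-in similarity other than BM25.

```python
def explain_request(index, doc_id, query):
    """Return (path, body) for the Explain API: GET /<index>/_explain/<id>."""
    path = "/{}/_explain/{}".format(index, doc_id)
    return path, {"query": query}


def title_mapping_with_similarity(similarity="boolean"):
    """Index body that sets a non-default similarity on a 'title' text field."""
    return {
        "mappings": {
            "properties": {
                # 'title' is a placeholder field name for this sketch.
                "title": {"type": "text", "similarity": similarity}
            }
        }
    }


if __name__ == "__main__":
    # Placeholder index name and document id.
    path, body = explain_request("my_index", 42, {"match": {"title": "mont blanc"}})
    print(path, body)
    print(title_mapping_with_similarity())
```

The explain response breaks the score of one document down per field and per term, which should show directly whether the ngram matches outweigh the exact word match.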