ahmia/ahmia-site

Try out weighted fields search and compare with current results

chamalis opened this issue · 0 comments

The current search uses a copy_to field that combines title, meta, and anchors with equal weight (as far as I can tell). [1]

A multi_match search is then performed on that combined field, discarding stemming, etc. [2]

Try ordering results with per-field weights instead, e.g.:

...
"multi_match": {
    "query": query,
    "type": "most_fields",
    "fields": [
        # "fancy",
        # "fancy.stemmed",
        # "fancy.shingles",
        "title^4", "meta^2", "content^2", "anchors^2",
        # TODO: find a way to use the stemmed and shingles filters here
    ],
    "minimum_should_match": "75%",
    "cutoff_frequency": 0.01,
}
...

and compare the results.
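
One possible way to run the comparison is sketched below. It only builds the two query bodies and scores how much two ranked result lists overlap; it does not talk to Elasticsearch. The names `build_query` and `overlap_at_k` are hypothetical helpers, and treating the commented-out `fancy*` fields as the current search fields is an assumption based on the snippet above, not something checked against views.py.

```python
from typing import Dict, List, Sequence

# Assumption: the commented-out fields in the snippet are what the current search uses.
CURRENT_FIELDS = ["fancy", "fancy.stemmed", "fancy.shingles"]
# Weighted fields as proposed in this issue.
WEIGHTED_FIELDS = ["title^4", "meta^2", "content^2", "anchors^2"]


def build_query(query: str, fields: Sequence[str]) -> Dict:
    """Build a search body with the same multi_match options as the snippet."""
    return {
        "query": {
            "multi_match": {
                "query": query,
                "type": "most_fields",
                "fields": list(fields),
                "minimum_should_match": "75%",
                "cutoff_frequency": 0.01,
            }
        }
    }


def overlap_at_k(ids_a: List[str], ids_b: List[str], k: int = 10) -> float:
    """Fraction of documents shared by the top-k of two ranked id lists."""
    top_a, top_b = set(ids_a[:k]), set(ids_b[:k])
    if not top_a and not top_b:
        return 1.0  # two empty result sets are trivially identical
    return len(top_a & top_b) / max(len(top_a), len(top_b))
```

The two bodies from `build_query(q, CURRENT_FIELDS)` and `build_query(q, WEIGHTED_FIELDS)` could be sent to the index for a handful of representative queries, and the returned document ids passed to `overlap_at_k`. A low overlap would flag the queries worth inspecting by hand.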

[1] https://github.com/ahmia/ahmia-index/blob/master/mappings_tor.json#L139
[2] https://github.com/ahmia/ahmia-site/blob/master/ahmia/search/views.py#L82