klokantech/osmnames-sphinxsearch

Edinburgh and Buenos Aires low ranked


It is very hard to find "Edinburgh" in Scotland and "Buenos Aires". Something seems to be wrong with the ranking of these: hamlets and villages appear ahead of these capitals.

Maybe related to:
OSMNames/OSMNames#41

Is there a problem with the OSMNames data for these two examples, @MartinMikita?

Oops.

The phrase "Zurich" does not find the city either. See:
http://osmnames.klokantech.com/#q=zurich
Only this query does:
http://osmnames.klokantech.com/#q=zurich%2C%20switzerland

The phrase "Edinburgh" does not find the city in Scotland.
Only http://osmnames.klokantech.com/#q=Edinburgh%20scotland does.

The phrase "Paris" has the same issue!

THIS IS URGENT @MartinMikita.

I found a problem: overflow of the weight value.

http://sphinxsearch.com/docs/current.html#api-func-setfieldweights

There is no enforced limit on the maximum weight value at the moment. However, beware that if you set it too high you can start hitting 32-bit wraparound issues. For instance, if you set a weight of 10,000,000 and search in extended mode, then maximum possible weight will be equal to 10 million (your weight) by 1 thousand (internal BM25 scaling factor, see Section 5.4, “Search results ranking”) by 1 or more (phrase proximity rank). The result is at least 10 billion that does not fit in 32 bits and will be wrapped around, producing unexpected results.
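
To make the wraparound concrete, here is a rough sketch of the arithmetic from the quoted documentation. It assumes the weight is stored as an unsigned 32-bit integer (the docs only say "32-bit") and uses the 1,000 BM25 scaling factor they mention. The function names and the sample weights other than 10,000,000 are made up for illustration; they are not the values used in this project's configuration.

```python
# Assumption: the final weight is held in an unsigned 32-bit integer.
UINT32_MODULUS = 2 ** 32

# Internal BM25 scaling factor, as stated in the quoted Sphinx documentation.
BM25_SCALE = 1000


def full_weight(field_weight, proximity_rank=1):
    """Maximum possible weight before any truncation to 32 bits."""
    return field_weight * BM25_SCALE * proximity_rank


def wrapped_weight(field_weight, proximity_rank=1):
    """Weight after 32-bit wraparound (hypothetical helper for illustration)."""
    return full_weight(field_weight, proximity_rank) % UINT32_MODULUS


def fits_in_32_bits(field_weight, proximity_rank=1):
    """True if the maximum possible weight does not wrap around."""
    return full_weight(field_weight, proximity_rank) < UINT32_MODULUS


if __name__ == "__main__":
    # 10,000,000 is the example from the docs; the others are illustrative.
    for weight in (1_000_000, 4_300_000, 10_000_000):
        print(weight, full_weight(weight), wrapped_weight(weight),
              fits_in_32_bits(weight))
```

Under these assumptions a field weight of 4,300,000 wraps to a much smaller value than a field weight of 1,000,000 produces, so a match that should rank higher ends up ranked lower. That would be consistent with capitals showing up behind hamlets and villages.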