microsoft/kernel-memory

[Bug] Multiple searches for one search call when using minRelevance with Azure AI Search

chaelli opened this issue · 6 comments

Context / Scenario

Using the search endpoint and AzureAISearch as memory (hybrid search enabled).

What happened?

When I use the minRelevance parameter (0.02 in the example), the search takes much longer to return results. I checked App Insights and can see that KM runs many searches against Azure AI Search for a single search call.
You can see the app insights screen here: https://ibb.co/d2NtxdL

Importance

a fix would make my life easier

Platform, Language, Versions

Linux on Azure; .NET Core; almost up to date with the main branch.

Relevant log output

see screenshot: https://ibb.co/d2NtxdL

dluc commented

Hi @chaelli, are you setting a limit on the number of records to retrieve, or asking to retrieve all? With a min relevance of 0.02, unless you set a limit, you're likely to fetch all records from memory.

@dluc that was my first thought as well.. and it will probably be something like that.. but:

  • yes, I use a limit of 5 and also pass that to the SearchAsync method (checked again)
  • I only get 3 documents (1 partition each)
  • aaaand the 0.02 is so low, because the values are really low (I think that has to do with the way hybrid search works with Azure AI Search)
  • when I disable hybrid search it's much quicker and, I assume, does not do that many requests

the longer I think about it the more I think I'm in the wrong repo here.. because the GetSimilarListAsync is only called once... so there might be an issue with the Azure Search client?!

aaah - explaining sometimes helps :D - The issue is that options.Size is only set inside the filters condition =>

when (due to the way hybrid search rates the results) there are tons of results with a very low score (below 0.02) and there is no limit on the search side, the client keeps fetching results until either there are none left or it has collected at least "limit" results with a high enough relevance.
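
To make the behaviour described above concrete, here is a small, self-contained simulation. It is not the Kernel Memory or Azure SDK code: `FakeSearchIndex`, `get_similar`, and the page size of 50 are assumptions made up for illustration. It only mimics a lazily paged result set in which each page costs one request, and shows how a missing server-side size leads to many requests when most hits fall below the minimum relevance.

```python
from typing import Iterator, List, Optional

PAGE_SIZE = 50  # assumed page size of the hypothetical search service


class FakeSearchIndex:
    """Stand-in for a paged search service; counts the requests it serves."""

    def __init__(self, scores: List[float]) -> None:
        self.scores = sorted(scores, reverse=True)
        self.requests = 0

    def search(self, size: Optional[int] = None) -> Iterator[float]:
        """Yield scores lazily, one request per page, capped at `size` if given."""
        total = len(self.scores) if size is None else min(size, len(self.scores))
        for start in range(0, total, PAGE_SIZE):
            self.requests += 1  # each page is one round trip
            yield from self.scores[start:min(start + PAGE_SIZE, total)]


def get_similar(index: FakeSearchIndex, min_relevance: float, limit: int,
                size: Optional[int] = None) -> List[float]:
    """Collect up to `limit` scores that reach `min_relevance`."""
    results: List[float] = []
    for score in index.search(size=size):
        if score < min_relevance:
            continue  # skipped, but iteration (and paging) goes on
        results.append(score)
        if len(results) >= limit:
            break  # only reached if enough relevant hits exist
    return results


if __name__ == "__main__":
    # 3 relevant hits and 1000 hits below the 0.02 threshold
    scores = [0.03, 0.025, 0.021] + [0.01] * 1000

    unbounded = FakeSearchIndex(scores)
    print(get_similar(unbounded, 0.02, limit=5), unbounded.requests)

    bounded = FakeSearchIndex(scores)
    print(get_similar(bounded, 0.02, limit=5, size=5), bounded.requests)
```

In this sketch both calls return the same three results, but the unbounded call walks every page because the client-side limit of 5 is never reached, while the call that also passes the limit as the server-side size stops after one request.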

dluc commented

I recall seeing something along those lines, with the client not stopping even after reaching the number of records requested.

I added some extra code in GetSimilarListAsync to stop, see here:

// In cases where Azure Search is returning too many records

Could you debug that foreach loop and see what is happening?

dluc commented

I just saw your PR. I see the problem: the limit was set only when using filters.
Thanks for sending it.

dluc commented

thank you @chaelli !