microsoft/kernel-memory

[Question] MemoryFilter mixing conditions

luismanez opened this issue · 3 comments

Context / Scenario

This issue is a continuation of #245.

That issue is fixed and works fine, but we now have another scenario that the current approach cannot handle, and I'm not sure how to solve it.

Question

When we apply the security trimming scenario described in the previous issue and also need to filter by a different tag, I can't find a way to do it. What I want is a query like:

(tags/any(s: search.in(s, 'Authorized:xxxxxx,Authorized:xxxxx,Authorized:xxxxx,…')))
and
(tags/any(s: s eq 'Department:Marketing'))

Any idea how this could be achieved?
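For reference, here is a minimal sketch of how such a filter string could be assembled for Azure AI Search. It assumes, as in the query above, that tags are stored as `key:value` strings in a collection field named `tags`. `TagFilterSketch` and its methods are hypothetical helpers, not part of Kernel Memory. The sketch uses the three-argument form of `search.in`, where the third argument lists the delimiter characters, and it does not handle tag values that contain the delimiter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not a Kernel Memory API.
public static class TagFilterSketch
{
    // OData string literals escape a single quote by doubling it.
    private static string Escape(string s) => s.Replace("'", "''");

    // Matches records having at least one "key:value" tag from the list.
    // Assumption: no value contains the delimiter character.
    public static string AnyOf(string key, IEnumerable<string> values, char delimiter = '|')
    {
        var tags = values.Select(v => Escape(key + ":" + v));
        var list = string.Join(delimiter.ToString(), tags);
        return "tags/any(s: search.in(s, '" + list + "', '" + delimiter + "'))";
    }

    // Matches records having exactly this "key:value" tag.
    public static string Equal(string key, string value)
    {
        var tag = Escape(key + ":" + value);
        return "tags/any(s: s eq '" + tag + "')";
    }

    // Joins clauses with the OData "and" operator, parenthesizing each one.
    public static string And(params string[] clauses)
        => string.Join(" and ", clauses.Select(c => "(" + c + ")"));
}
```

Calling `TagFilterSketch.And(TagFilterSketch.AnyOf("Authorized", groups), TagFilterSketch.Equal("Department", "Marketing"))` would produce one filter string with both conditions.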

I don't know what the other vector DBs (Redis, Qdrant…) support, but for Azure Search, where queries can be quite complex, I think we could have a FilterQueryBuilder class, injected when calling WithAzureAISearchMemoryDb, that returns a string with the entire Filter value.

Ideally, I'd rather have more control in the MemoryFilter class, with a fluent API that lets me build the query operators (this code is crap, but you get the point):

MemoryFilters
    .ByAnyTagValue("Authorized", ["group1", "group2", "group3"]) // translates to search.in()
    .And().ByTag("Department", "IT") // translates to: and department eq IT
    .And().ByTag("Department", "Marketing") // translates to: and department eq Marketing
    .Or().ByTag("Location", "London") // translates to: or location eq London
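One way such a fluent builder could be shaped is sketched below, assuming AND binds tighter than OR, so the chain becomes a list of AND-groups that are OR'ed together. Every type and method name here (`FluentFilterSketch`, `TagCondition`, `Build`) is hypothetical and only illustrates the idea; none of them exist in Kernel Memory.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
// A condition matches when the tag has any of the listed values.
public sealed record TagCondition(string Key, IReadOnlyList<string> AnyOfValues);

public sealed class FluentFilterSketch
{
    // Each inner list is one AND-group; the outer list is OR'ed together.
    private readonly List<List<TagCondition>> _groups =
        new List<List<TagCondition>> { new List<TagCondition>() };

    public FluentFilterSketch ByTag(string key, string value)
        => ByAnyTagValue(key, value);

    public FluentFilterSketch ByAnyTagValue(string key, params string[] values)
    {
        _groups[_groups.Count - 1].Add(new TagCondition(key, values));
        return this;
    }

    // And() only aids readability: conditions in the current group are already AND'ed.
    public FluentFilterSketch And() => this;

    // Or() closes the current AND-group and starts a new one.
    public FluentFilterSketch Or()
    {
        _groups.Add(new List<TagCondition>());
        return this;
    }

    // Returns the non-empty AND-groups; each group could become one search.
    public IReadOnlyList<IReadOnlyList<TagCondition>> Build()
        => _groups.Where(g => g.Count > 0)
                  .Select(g => (IReadOnlyList<TagCondition>)g)
                  .ToList();
}
```

With this shape, each storage engine would only need to translate a list of AND-groups, which it could run either as a single native query or as one search per group.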

Thanks

dluc commented
MemoryFilters
/* 1 */    .ByAnyTagValue("Authorized", ["group1", "group2", "group3"]) // translates to search.in()
/* 2 */    .And().ByTag("Department", "IT") // translates to: and department eq IT
/* 3 */    .And().ByTag("Department", "Marketing") // translates to: and department eq Marketing
/* 4 */    .Or().ByTag("Location", "London") // translates to: or location eq London

From a quick look, that translates to two searches, merging the results:

  • search 1: (Authorized == group1 OR Authorized == group2 OR Authorized == group3) AND (Department == IT) AND (Department == Marketing)
  • search 2: (Location == London)

then merge the two result sets, sorting by relevance.
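The merge step could look roughly like the sketch below. `Hit` is a hypothetical stand-in for a search result with an id and a relevance score (higher is better), not the actual Kernel Memory result type. A record returned by both searches is kept once, with its best score. The sketch assumes the scores are comparable across searches, which should hold when both searches use the same query text and differ only in their filters.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical result shape: an id plus a relevance score.
public sealed record Hit(string Id, double Relevance);

public static class ResultMergeSketch
{
    // Unions several result sets, de-duplicates by id, and sorts by relevance.
    public static List<Hit> Union(params IEnumerable<Hit>[] resultSets)
        => resultSets
            .SelectMany(set => set)
            .GroupBy(hit => hit.Id)
            // Keep the highest-scoring copy of each duplicated record.
            .Select(g => g.OrderByDescending(h => h.Relevance).First())
            .OrderByDescending(h => h.Relevance)
            .ToList();
}
```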

dluc commented

btw, I looked into the option of providing a fluent language and custom filters. For instance, I would love to have a "NOT" filter to exclude results. For now I'm using the same approach: multiple searches and merging (or subtracting).
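The subtracting variant can be sketched the same way: run one search for the records to keep and one for the records to exclude, then drop the overlap by id. `Candidate` and `ResultSubtractSketch` are hypothetical names. The result is only exact if the exclusion search returns every matching record, which a result limit can prevent.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical result shape: an id plus a relevance score.
public sealed record Candidate(string Id, double Relevance);

public static class ResultSubtractSketch
{
    // Emulates "included AND NOT excluded" by removing excluded ids.
    // The order of the included results is preserved.
    public static List<Candidate> Except(
        IEnumerable<Candidate> included,
        IEnumerable<Candidate> excluded)
    {
        var excludedIds = new HashSet<string>(excluded.Select(c => c.Id));
        return included.Where(c => !excludedIds.Contains(c.Id)).ToList();
    }
}
```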

Finding a solution that works across all the engines is a major project on its own, e.g. porting all DBs to EF first and using LINQ. A less expensive option could be allowing callers to pass raw query strings, but that is a security nightmare. Another option might be OData, but it would work only for some engines, and OData is kind of picky about allowed symbols.

Other options involve custom interfaces, dependency injection and so on. I think we could try generics, allowing the filtering classes to be extended. If you want to give it a shot, without spending more than a couple of hours, we could try and see what the code looks like and how it affects the rest of the project.

I'll take a deeper look at the Filtering classes and let you know.
Thanks!