marqo-ai/marqo

[BUG] filters don't work on lexical searches

jess-lord opened this issue · 4 comments

Describe the bug
Filters do not work on lexical searches; the same filters work with TENSOR search.

To Reproduce
Steps to reproduce the behavior:

  1. Run this query on documents that have both a content field and a filename field:
    {
      "q": "RequiredString123",
      "searchMethod": "LEXICAL",
      "searchableAttributes": [
        "content"
      ],
      "limit": 10,
      "showHighlights": true,
      "filter": "filename:Important_File.pdf"
    }

Expected behavior
Only documents with the requested filename should appear in the results.

Additional context
TENSOR filters work, but not LEXICAL.

Hi, thanks for submitting the bug. We are working on this and will let you know as soon as possible.

The problem probably stems from an error in parsing the "." character in the filter string.

Hello @jess-lord

Thank you for bringing this issue to our attention. We have identified the cause of the bug you encountered: filter strings are tokenized differently in lexical search and in tensor search.

Making the tokenization consistent is on our roadmap. For now, we suggest the following workaround to get an exact match in lexical search: add \" (an escaped double quote) before and after the filter value:

{
  "q": "RequiredString123",
  "searchMethod": "LEXICAL",
  "searchableAttributes": [
    "content"
  ],
  "limit": 10,
  "showHighlights": true,
  "filter": "filename: \"Important_File.pdf\""
}

Please note that no change is necessary for tensor search.
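
For anyone building the request body in code, the workaround can be sketched as follows. This is a minimal illustration, not part of the Marqo client: build_filter is a hypothetical helper, and it assumes the filter value itself contains no double quotes. It only constructs the JSON body and does not send it anywhere.

```python
import json


def build_filter(field: str, value: str, search_method: str) -> str:
    """Hypothetical helper: quote the filter value for LEXICAL searches only.

    Assumes `value` contains no double quotes of its own.
    """
    if search_method.upper() == "LEXICAL":
        # Wrap the value in double quotes, as in the maintainers' example.
        return f'{field}: "{value}"'
    # Per the comment above, tensor search needs no change.
    return f"{field}:{value}"


search_method = "LEXICAL"
body = {
    "q": "RequiredString123",
    "searchMethod": search_method,
    "searchableAttributes": ["content"],
    "limit": 10,
    "showHighlights": True,
    "filter": build_filter("filename", "Important_File.pdf", search_method),
}

# json.dumps escapes the inner double quotes as \" in the serialized body.
print(json.dumps(body, indent=2))
```

Serializing with json.dumps produces the \" escapes automatically, so the quotes only need to be written once in the Python string.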

We hope that this solution helps you. If you have any further questions or concerns, please don't hesitate to let us know.

fantastic - thanks @pandu-k and @wanliAlex!