qdrant/qdrant

Can Qdrant's search API directly handle the nested "if-else" condition in the filter?


In my experience, Qdrant's search API cannot directly express "if-else" conditions in a filter, so I have to write extra logic that splits the request into several queries. Is this intentional? If not, could I implement and contribute the related functionality?

Here is some pseudocode that only illustrates the logic:

if "A" in payload:  # i.e. Filter(has_key="A")
    must_list.append(FieldCondition("A-datetime", dict))
else:
    must_list.append(FieldCondition("B-datetime", dict))

query_filter = Filter(must=must_list)

I'm sorry for the lack of explanation. 😅

In my current situation, I need a single composite filter that is applied in one request, rather than several separate filter passes, so that it works together with other search parameters such as top_k. If anyone knows how to do this, please help.

Thank you.

Would it be equivalent to something like:

(has_key(A) AND (A-datetime condition)) OR (NOT has_key(A) AND (B-datetime condition))?
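A quick way to check that this rewrite really is equivalent to the if-else form: the plain-Python sketch below compares the two over every combination of inputs. It is not Qdrant code; `if_else_form` and `boolean_form` are names made up for illustration, and the three booleans stand in for "the point has key A", "the A-datetime condition holds", and "the B-datetime condition holds".

```python
from itertools import product


def if_else_form(has_a: bool, a_ok: bool, b_ok: bool) -> bool:
    # Original logic: apply the A condition when the key exists, else the B condition.
    if has_a:
        return a_ok
    return b_ok


def boolean_form(has_a: bool, a_ok: bool, b_ok: bool) -> bool:
    # Rewritten logic: (has_key(A) AND A) OR (NOT has_key(A) AND B).
    return (has_a and a_ok) or (not has_a and b_ok)


# Exhaustively compare both forms over all 8 input combinations.
equivalent = all(
    if_else_form(*case) == boolean_form(*case)
    for case in product([False, True], repeat=3)
)
print(equivalent)  # True
```

Because the two forms agree on all inputs, an if-else over key existence can be expressed with AND, OR, and NOT alone.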

Yesss. Let me explain it a little more accurately.

All points have "modified_datetime" (the B-datetime condition), and only some points have "expired_datetime" (the A-datetime condition).

# (B-datetime condition)
modified_datetime_range = {
    "key": "modified_datetime",
    "range": {
        "gte": past_time_range.isoformat(), # past_time_range = now_time - relativedelta(months=10)
        "lt": now_time.isoformat(),
    },
}

# (A-datetime condition)
expired_datetime = {
    "key": "expired_datetime",
    "range": {"gte": now_time.isoformat()},
}

Given this setup, how should I write the filter for a search like the following?

results = client.search(
    collection_name=collection_name,
    query_vector=query_vector,
    query_filter=models.Filter(
        must=[] # How can such conditions fit in here?
    ),
    limit=top_k,
    score_threshold=threshold
)

You can use should instead of must; should is the equivalent of an "OR" condition.
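To make the clause semantics concrete, here is a small plain-Python emulation of how must, should, and must_not combine: must behaves like AND, should like OR, must_not like NOT, and a nested filter can appear wherever a condition can. This is only an illustration of the boolean logic, not Qdrant's implementation; `matches` and the lambda predicates are hypothetical helpers, and the threshold of 10 is an arbitrary stand-in for the datetime ranges.

```python
from typing import Callable, Union

Payload = dict
# A condition is either a leaf predicate on the payload or a nested filter dict.
Condition = Union[Callable[[Payload], bool], dict]


def matches(flt: dict, payload: Payload) -> bool:
    """Evaluate a nested must/should/must_not filter against one payload."""

    def check(cond: Condition) -> bool:
        # A dict is a nested filter; anything else is a leaf predicate.
        return matches(cond, payload) if isinstance(cond, dict) else cond(payload)

    must = flt.get("must", [])
    should = flt.get("should", [])
    must_not = flt.get("must_not", [])
    return (
        all(check(c) for c in must)                        # AND
        and (not should or any(check(c) for c in should))  # OR (ignored when empty)
        and not any(check(c) for c in must_not)            # NOT
    )


has_a = lambda p: p.get("A") is not None
a_ok = lambda p: p["A"] >= 10
b_ok = lambda p: p["B"] >= 10

# (has A AND a_ok) OR (NOT has A AND b_ok)
flt = {
    "should": [
        {"must": [has_a, a_ok]},
        {"must": [b_ok], "must_not": [has_a]},
    ]
}

print(matches(flt, {"A": 12, "B": 1}))  # True: A exists and passes
print(matches(flt, {"A": 5, "B": 50}))  # False: A exists but fails, B is ignored
print(matches(flt, {"B": 50}))          # True: A is missing, B passes
```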

Thank you for guiding me so kindly!

Also, I just learned about the 'Nested object filter' through a link (HERE).
Now I think I can write conditional clauses with all the nesting I need.

I'm grateful for this great open source.

Sorry, I just have one more question.

Following your advice, I implemented it this way. However, I can't find a condition that checks whether a key exists. How can I express that?

A_filter = models.Filter(
    must=[
        models.Filter(has_key="expiration_date"),  # placeholder: how do I check that the key exists?
        models.FieldCondition(
            key=expiration_date.get("key"),
            range=models.DatetimeRange(
                gt=expiration_date.get("gt"),
                gte=expiration_date.get("gte"),
                lt=expiration_date.get("lt"),
                lte=expiration_date.get("lte"),
            ),
        ),
    ]
)
B_filter = models.Filter(
    must_not=[models.Filter(has_key="expiration_date")],  # placeholder: the key must be absent
    must=[
        models.FieldCondition(
            key=datetime_values.get("key"),
            range=models.DatetimeRange(
                gt=datetime_values.get("gt"),
                gte=datetime_values.get("gte"),
                lt=datetime_values.get("lt"),
                lte=datetime_values.get("lte"),
            ),
        )
    ],
)
should_list.append(A_filter)
should_list.append(B_filter)
# In search
query_filter=models.Filter(should=should_list)

Please assume that each of these fields has a payload index.

You can try using the is_empty condition instead of has_key.
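As a rough sketch of what that could look like for the question above, the snippet below builds the composite filter as a plain dict in the JSON shape the filtering documentation uses for REST requests. `build_date_filter` is a hypothetical helper, the field names are taken from earlier in the thread, and a fixed `timedelta` replaces the original `relativedelta(months=10)` only to keep the example standard-library only. Per the documentation, is_empty matches points where the field is missing, null, or an empty list.

```python
from datetime import datetime, timedelta, timezone


def build_date_filter(now: datetime, lookback: timedelta) -> dict:
    """Hypothetical helper: (key set AND not expired) OR (key empty AND recently modified)."""
    expired_key = "expired_datetime"
    key_is_empty = {"is_empty": {"key": expired_key}}
    not_expired = {"key": expired_key, "range": {"gte": now.isoformat()}}
    recently_modified = {
        "key": "modified_datetime",
        "range": {"gte": (now - lookback).isoformat(), "lt": now.isoformat()},
    }
    return {
        "should": [
            # Branch 1: the optional key is set, so apply its own range.
            {"must_not": [key_is_empty], "must": [not_expired]},
            # Branch 2: the optional key is empty, so fall back to the modified range.
            {"must": [key_is_empty, recently_modified]},
        ]
    }


now = datetime(2024, 1, 1, tzinfo=timezone.utc)  # fixed time, for a reproducible example
date_filter = build_date_filter(now, timedelta(days=300))
print(date_filter["should"][0])
```

Because the whole condition is one filter, it is evaluated in a single search request, so limit and score_threshold still apply to the combined result.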

I resolved it using the method you described. Thank you!

from qdrant_client import models

# datetime_list, mustkey, date_should_list and must_list are defined elsewhere.
# mustkey is the optional payload key; each entry of datetime_list is a dict
# with a "key" and a "range".
for dt_condition in datetime_list:  # renamed to avoid shadowing the datetime module
    datetime_key = dt_condition.get("key")
    datetime_range = dt_condition.get("range", {})
    range_condition = models.FieldCondition(
        key=datetime_key,
        range=models.DatetimeRange(
            gt=datetime_range.get("gt"),
            gte=datetime_range.get("gte"),
            lt=datetime_range.get("lt"),
            lte=datetime_range.get("lte"),
        ),
    )
    key_is_empty = models.IsEmptyCondition(
        is_empty=models.PayloadField(key=mustkey)
    )
    if datetime_key == mustkey:
        # The optional key is set: apply its own range.
        filter_tmp = models.Filter(must_not=[key_is_empty], must=[range_condition])
    else:
        # The optional key is empty: apply this range instead.
        filter_tmp = models.Filter(must=[key_is_empty, range_condition])
    date_should_list.append(filter_tmp)
# in search, must=must_list
must_list.append(models.Filter(should=date_should_list))

https://qdrant.tech/documentation/concepts/filtering/#is-empty

Mentioning how to check for the existence of keys in the payload might be helpful to someone.
I've resolved it thanks to your help, and I think I'll be able to use Qdrant more effectively in the future.

Thank you. 🙌