qdrant/qdrant

is_empty slow even with index

Closed this issue · 1 comments

Current Behavior

An is_empty filter on a large collection is slow, even after creating an index on the filtered field.

Steps to Reproduce

  1. Create a large collection, e.g. 36_000_000 entries.
  2. Add a keyword field to about 400_000 of those entries.
  3. Create an index on this field.

Payload schema:

"always_exists_field":
{
  "data_type": "keyword",
  "points": 36057546
},
"possibly_empty_field":
{
  "data_type": "keyword",
  "points": 449696
}
  4. Query / scroll with a must_not filter on is_empty for this field, so that only points where the field is set are returned.
  5. Performance is slow.
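For reference, steps 3 and 4 can be sketched as REST request bodies. This is a minimal sketch, not the exact requests that were run: the collection name `my_collection` is a hypothetical placeholder, the limit is arbitrary, and the bodies are assumed to go to the index endpoint (`PUT /collections/{collection}/index`) and the scroll endpoint (`POST /collections/{collection}/points/scroll`). Only the field name comes from the payload schema above.

```python
import json

COLLECTION = "my_collection"     # hypothetical collection name
FIELD = "possibly_empty_field"   # field name from the payload schema above


def index_request(field: str) -> dict:
    """Body for PUT /collections/{collection}/index (keyword payload index)."""
    return {"field_name": field, "field_schema": "keyword"}


def non_empty_scroll_request(field: str, limit: int = 10) -> dict:
    """Body for POST /collections/{collection}/points/scroll.

    must_not + is_empty keeps only points where the field has a value.
    """
    return {
        "filter": {"must_not": [{"is_empty": {"key": field}}]},
        "limit": limit,          # arbitrary page size, chosen for illustration
        "with_payload": True,
        "with_vector": False,
    }


if __name__ == "__main__":
    # Print both bodies so they can be pasted into curl or an HTTP client.
    print(json.dumps(index_request(FIELD), indent=2))
    print(json.dumps(non_empty_scroll_request(FIELD), indent=2))
```

The code only builds the JSON bodies and sends nothing, so it can be inspected without a running Qdrant instance.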

Expected Behavior

With an index on the field, the query should be much faster.

Possible Solution

N/A

Context (Environment)

I am trying to build a system that slowly adds prediction values to the embeddings in Qdrant. These can then be used to build a classifier against arbitrary/new points.

Is there maybe a way to achieve the same thing with links?

Detailed Description

N/A

Possible Implementation

N/A

The Discord forum is a more appropriate place to ask this question.