quickwit-oss/quickwit

Optimization: filter + scoring by timestamp

Opened this issue · 2 comments

There are several ideas we could leverage to speed up filtering and sorting by timestamp.

When we sort by AND filter on a timestamp range, we end up fetching the timestamp twice.
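To illustrate what a single fetch could look like, here is a minimal sketch, not the actual quickwit/tantivy collector code. It assumes the timestamp column is available as a plain `&[i64]` slice indexed by doc id; `top_k_in_range` is a hypothetical helper name. Each timestamp is read once and used for both the range check and the top-k ordering.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical helper: returns up to `k` `(timestamp, doc)` pairs with the
/// largest timestamps inside `[t_start, t_end]`, newest first.
/// Each timestamp is read exactly once and reused for filtering and sorting.
fn top_k_in_range(timestamps: &[i64], t_start: i64, t_end: i64, k: usize) -> Vec<(i64, usize)> {
    // Min-heap (via `Reverse`) holding the best `k` hits seen so far.
    let mut heap: BinaryHeap<Reverse<(i64, usize)>> = BinaryHeap::new();
    for (doc, &ts) in timestamps.iter().enumerate() {
        // Filter step: uses the value we just fetched.
        if ts < t_start || ts > t_end {
            continue;
        }
        // Sort step: reuses the same value, no second fetch.
        heap.push(Reverse((ts, doc)));
        if heap.len() > k {
            heap.pop(); // drop the smallest timestamp
        }
    }
    let mut hits: Vec<(i64, usize)> = heap.into_iter().map(|Reverse(hit)| hit).collect();
    hits.sort_unstable_by(|a, b| b.cmp(a)); // newest first
    hits
}
```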

The timestamp is often almost sorted.
A minor amount of metadata could make it possible to restrict our query.

A timestamp within [t_start, t_end] implies a doc in [doc_a, doc_b].
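One possible shape for that metadata, as a rough sketch: keep the min and max timestamp of each fixed-size block of docs, then map a timestamp range to the doc range spanned by the overlapping blocks. `BlockMinMax`, `build`, `doc_range` and the block size are all hypothetical names and choices, not existing quickwit or tantivy APIs.

```rust
use std::ops::Range;

/// Hypothetical metadata: (min, max) timestamp of each block of `block_size` docs.
struct BlockMinMax {
    block_size: usize,
    num_docs: usize,
    min_max: Vec<(i64, i64)>,
}

impl BlockMinMax {
    fn build(timestamps: &[i64], block_size: usize) -> Self {
        assert!(block_size > 0);
        let min_max = timestamps
            .chunks(block_size)
            .map(|block| {
                // `chunks` never yields an empty block, so min/max always exist.
                let min = *block.iter().min().unwrap();
                let max = *block.iter().max().unwrap();
                (min, max)
            })
            .collect();
        BlockMinMax { block_size, num_docs: timestamps.len(), min_max }
    }

    /// Returns a half-open doc range [doc_a, doc_b) containing every doc whose
    /// timestamp lies in [t_start, t_end], or `None` if no block overlaps.
    /// Docs inside the range may still fall outside the timestamp range.
    fn doc_range(&self, t_start: i64, t_end: i64) -> Option<Range<usize>> {
        let overlaps = |&(min, max): &(i64, i64)| min <= t_end && max >= t_start;
        let first = self.min_max.iter().position(overlaps)?;
        let last = self.min_max.iter().rposition(overlaps)?;
        let doc_a = first * self.block_size;
        let doc_b = ((last + 1) * self.block_size).min(self.num_docs);
        Some(doc_a..doc_b)
    }
}
```

The closer the column is to sorted, the tighter the returned range; on a fully shuffled column it degrades to the whole segment.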

I think this is similar to what I wrote recently in quickwit-oss/tantivy#2352 (comment):

I've been wondering whether we should flag fast fields as almost sorted during creation (e.g. almost sorted within a range of 100 values) and then use that information to do a binary_search + 100-value scan.

The almost sorted check could be done during serialization and should not cost much.
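A rough sketch of how the check and the lookup could fit together. The definition of "almost sorted" here is an assumption, since the comment does not pin one down: a column is almost sorted within `window` if every value is >= all values located at least `window` positions before it. Under that assumption, every `window`-th value forms a non-decreasing sequence that can be binary searched. All function names are hypothetical, and the column is modelled as a plain `&[i64]`.

```rust
use std::ops::Range;

/// One-pass check (e.g. at serialization time): true if every value is >=
/// all values at least `window` positions earlier. Assumed definition.
fn is_almost_sorted(values: &[i64], window: usize) -> bool {
    assert!(window > 0);
    let mut lagged_max = i64::MIN; // max of values[0..=i - window]
    for i in window..values.len() {
        lagged_max = lagged_max.max(values[i - window]);
        if values[i] < lagged_max {
            return false;
        }
    }
    true
}

/// Binary search over the samples values[0], values[window], values[2 * window], ...
/// Returns how many leading samples satisfy `pred`.
fn partition_point_strided(values: &[i64], window: usize, pred: impl Fn(i64) -> bool) -> usize {
    let num_samples = (values.len() + window - 1) / window;
    let (mut lo, mut hi) = (0, num_samples);
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if pred(values[mid * window]) { lo = mid + 1 } else { hi = mid }
    }
    lo
}

/// Conservative doc range containing every value in [t_start, t_end].
/// Only valid if `is_almost_sorted(values, window)` holds.
fn candidate_range(values: &[i64], window: usize, t_start: i64, t_end: i64) -> Range<usize> {
    let k0 = partition_point_strided(values, window, |v| v < t_start);
    let k1 = partition_point_strided(values, window, |v| v <= t_end);
    // Docs at least `window` before the last sample < t_start are too small.
    let start = k0.saturating_sub(2) * window;
    // Docs at least `window` after the first sample > t_end are too large.
    let end = ((k1 + 1) * window).min(values.len());
    start.min(end)..end
}

/// Binary search plus a short scan: exact matching docs, in doc order.
fn docs_in_range(values: &[i64], window: usize, t_start: i64, t_end: i64) -> Vec<usize> {
    candidate_range(values, window, t_start, t_end)
        .filter(|&doc| (t_start..=t_end).contains(&values[doc]))
        .collect()
}
```

With this definition, the scan reads at most about two windows of extra values on each side of the matching docs, which matches the "binary_search + 100 values scan" idea for a window of 100.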

Yes. Let's keep that for later though. :)