textexploration/mtas

Question: limiting the number of results to speed up a query

Opened this issue · 0 comments

Is it possible to limit the number of documents that are queried to potentially speed up the query resolution time? We are working with large text-corpora (more than 1 billion words) and would like to quickly obtain at most N results, preferably but not necessary in random order. Our goal is to quickly provide some results, which should be enough for most cases. Now we are using list query for getting a page of results (with start: and number: parameters), but it is slow for queries with millions of matches. As far as we understand, it is because in such cases nearly all documents need to be queried and this takes several minutes on a single machine.