elastic/elasticsearch

API to allow queries to bypass the query cache policy

rendel opened this issue · 5 comments

We currently have a performance issue with the new query cache policy. We have queries that are quite heavy to construct and compute, even on small segments. The UsageTrackingQueryCachingPolicy (which uses CacheOnLargeSegments) always refuses to cache our queries on small segments. This leads to a significant performance drop (5x to 10x) in our scenarios.
Another limitation of the UsageTrackingQueryCachingPolicy is that there is no easy way to indicate to it that our queries are costly to build, apart from making our queries subclass MultiTermQuery so that they are picked up by UsageTrackingQueryCachingPolicy#isCostly.
At the moment, the only solution we have is to configure Elasticsearch to switch back to the QueryCachingPolicy.ALWAYS_CACHE caching policy.
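For context, this is roughly what that workaround amounts to at the Lucene level. It is a minimal sketch assuming Lucene 5.x-era APIs (IndexSearcher#setQueryCachingPolicy and the QueryCachingPolicy.ALWAYS_CACHE constant); Elasticsearch configures the policy internally, so the snippet is illustrative only:

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.QueryCachingPolicy;
import org.apache.lucene.search.UsageTrackingQueryCachingPolicy;

final class CachingPolicyExample {

    static void configure(IndexSearcher searcher, boolean cacheEverything) {
        if (cacheEverything) {
            // Every query becomes eligible for caching on every segment,
            // regardless of segment size or how frequently the query is used.
            searcher.setQueryCachingPolicy(QueryCachingPolicy.ALWAYS_CACHE);
        } else {
            // Default behaviour: track usage and only cache frequently used
            // queries on sufficiently large segments.
            searcher.setQueryCachingPolicy(new UsageTrackingQueryCachingPolicy());
        }
    }
}
```

Switching to ALWAYS_CACHE globally works for us, but it also changes the behaviour for every other query on the index.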

Related to #16031

@rendel I'm curious how you figured out that your queries are heavy to construct on small segments? That seems counterintuitive. Could you provide some examples?

Hi @clintongormley

we have developed a custom query which embeds a large number of terms to perform a semi-join between indexes (see the siren-join plugin). The terms are encoded in a byte array for performance reasons and are decoded lazily at query execution time. The decoding of the terms is the heavy part; we cache them using a cache key. The issue now is that this decoding is always done for small segments.
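To illustrate the pattern (a simplified sketch with hypothetical names, not the actual plugin code): the join terms travel as a compact byte[], and turning them back into usable terms is the expensive step that only gets amortised when the query cache serves the segment from cache.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class EncodedTermsSketch {

    // Large, compact encoding of the join terms (hypothetical format).
    private final byte[] encodedTerms;

    EncodedTermsSketch(byte[] encodedTerms) {
        this.encodedTerms = encodedTerms;
    }

    /**
     * Called lazily at query execution time. When the query cache has an entry
     * for the segment this never runs; on small segments the current policy
     * never caches, so the decoding cost is paid on every search.
     */
    List<String> decodeTerms() {
        // Placeholder decoding: the real encoding is binary, this simply
        // splits a UTF-8 string on newlines to keep the sketch short.
        return Arrays.asList(new String(encodedTerms, StandardCharsets.UTF_8).split("\n"));
    }
}
```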

If a query is slow when it is not cached, I don't think the cache is to blame. It is something that users would hit anyway after a merge or a restart. I actually think not caching on small segments is very important as:

  • it does not affect performance with regular queries
  • it makes memory accounting more accurate (it is easier to account for the memory usage of a few large cache entries than of many tiny entries)
  • it avoids cache churn due to NRT search.

While I think there are things to improve based on the feedback that was given in #16031, I don't think we should make it possible to cache on all segments.
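For reference, the large-segment check applied by the default policy works roughly like the sketch below (approximate; the exact constants differ, and the real check also requires a minimum overall index size):

```java
import org.apache.lucene.index.IndexReaderContext;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.ReaderUtil;

final class LargeSegmentCheckSketch {

    /** Returns true if the segment holds at least minSizeRatio of the index's documents. */
    static boolean worthCaching(LeafReaderContext context, float minSizeRatio) {
        IndexReaderContext topLevel = ReaderUtil.getTopLevelContext(context);
        float sizeRatio = (float) context.reader().maxDoc() / topLevel.reader().maxDoc();
        // Small segments (e.g. freshly flushed NRT segments) fall below the ratio
        // and are skipped, which keeps the cache stable across refreshes.
        return sizeRatio >= minSizeRatio;
    }
}
```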

@jpountz I would agree that mainstream cases - the standard Lucene queries - should not be cached on small segments, and that the new caching policy is well adapted for those kinds of queries. However, there are very legitimate cases where this policy is too restrictive. We are not asking to change the high-level API (e.g., the query DSL), but just to offer that option at a low level for advanced users that - like us - are building on top of Elasticsearch.

We are thinking of something at the Java Lucene Query API level, where people creating a new custom Lucene Query could have some control over the caching policy. Maybe this is something that should be implemented at the Lucene level rather than in Elasticsearch?
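To make the idea concrete, here is a hypothetical sketch of the kind of hook we have in mind. None of these names exist in Lucene today, and the two-argument shouldCache signature is the Lucene 5.x one:

```java
import java.io.IOException;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryCachingPolicy;
import org.apache.lucene.search.UsageTrackingQueryCachingPolicy;

/** Hypothetical marker for queries that are expensive to build regardless of segment size. */
interface CostlyToBuild {
}

final class CostAwareCachingPolicy implements QueryCachingPolicy {

    // Standard queries keep the default behaviour.
    private final UsageTrackingQueryCachingPolicy delegate = new UsageTrackingQueryCachingPolicy();

    @Override
    public void onUse(Query query) {
        delegate.onUse(query);
    }

    @Override
    public boolean shouldCache(Query query, LeafReaderContext context) throws IOException {
        if (query instanceof CostlyToBuild) {
            // The query declared itself costly to build, so cache it even on small segments.
            return true;
        }
        return delegate.shouldCache(query, context);
    }
}
```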

Without such control, we would have to fall back to alternative options that are not very optimal:

  • tell users to activate the index.queries.cache.everything: true setting (but this means that standard queries will no longer benefit from the cache optimisations introduced by the new caching policy)
  • add and manage a secondary query cache that caches our custom queries (but this adds unnecessary complexity; a rough sketch of what it would involve follows below)
  • be able to change the cache implementation of Elasticsearch to introduce our own (but this does not look possible at the moment - we would have to fork Elasticsearch)

What would be the other fallback options available to us?
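For completeness, a rough sketch of what option (2) would involve. All names are illustrative, and the hard parts (sizing, eviction when segments disappear, memory accounting) are exactly the complexity we would rather not duplicate:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class SecondaryQueryCache<V> {

    // queryKey -> (segmentKey -> cached per-segment result)
    private final Map<Object, Map<Object, V>> cache = new ConcurrentHashMap<>();

    V getOrCompute(Object queryKey, Object segmentKey, Supplier<V> compute) {
        return cache
                .computeIfAbsent(queryKey, k -> new ConcurrentHashMap<>())
                .computeIfAbsent(segmentKey, k -> compute.get());
    }

    /** Must be called whenever a segment is merged away or closed, otherwise the cache leaks. */
    void evictSegment(Object segmentKey) {
        for (Map<Object, V> perQuery : cache.values()) {
            perQuery.remove(segmentKey);
        }
    }
}
```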