elastic/kibana

Ability to disable fields from default highlighting

ppf2 opened this issue · 30 comments

ppf2 commented

The Discover screen (saved search) fails to load and throws a "failed to highlight field" exception when a field value is too large.

Segmented Fetch: SearchPhaseExecutionException[Failed to execute phase [query_fetch], 
all shards failed; shardFailures {[f9Fe0c9WQ7mhuah00pw8Vw][index_name][0]: 
RemoteTransportException[[HOSTNAME][inet[/IP:PORT]][indices:data/read/search[phase/query+fetch]]]; nested: 
FetchPhaseExecutionException[[index_name][0]: query[filtered(ConstantScore(cache(_type:ticket)))
->BooleanFilter(+cache(created_at:[1296197280253 TO 
1422427680253]))],from[0],size[500],sort[<custom:"created_at": 
org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@5a14e4bc>!]: Fetch 
Failed [Failed to highlight field [comments.raw]]]; nested: 
RuntimeException[org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes
 can be at most 32766 in length; got 39162]; nested: MaxBytesLengthExceededException[bytes can be
 at most 32766 in length; got 39162]; }]
    at respond (https://HOSTNAME/index.js?_b=4673:78854:15)
    at checkRespForFailure (https://HOSTNAME/index.js?_b=4673:78822:7)
    at https://HOSTNAME/index.js?_b=4673:77509:7
    at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
    at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
    at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
    at https://HOSTNAME/index.js?_b=4673:20906:76
    at Scope.$eval (https://HOSTNAME/index.js?_b=4673:21893:28)
    at Scope.$digest (https://HOSTNAME/index.js?_b=4673:21705:31)
    at Scope.$apply (https://HOSTNAME/index.js?_b=4673:21997:24)

Would be nice to provide an option to exclude specific field(s) from the default highlighting.

I see a few problems from a usability perspective:

  • We can't detect ahead of time whether a field will be too large, so the failure has to happen first.
  • The user is unlikely to know that the field is too large. As far as I know, we don't expose the maximum field length anywhere easily accessible.
  • The error message isn't parsable, so we can't automatically disable highlighting, nor tell the user what went wrong.

I'd propose fixing this on the Elasticsearch side by creating a lenient highlighting mode that won't throw an exception when highlighting fails, preferably with some indication that we skipped highlighting and why.

As well as the proposed change to ES, how about also an advanced config option to disable highlighting entirely on the Kibana side? Otherwise this kills Kibana 4: there's no way to use it with the same dataset that works just fine in Kibana 3.

thanks.

ppf2 commented

Until elastic/elasticsearch#5836 is available in ES, I think we can use a workaround in K4 so that the Discover screen will work for these datasets.

+1 on this. Just for aesthetics, I would like to disable highlighting for searches I've placed in my dashboard.

edit: you can turn this off in the main.css of kibana
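
(For what it's worth, the CSS route only changes the rendering; Kibana still asks Elasticsearch to highlight, so it won't avoid the exception above. A sketch, assuming Kibana 4 renders highlights as `<mark>` elements:)

    /* in Kibana's main.css: neutralize highlight styling */
    mark {
      background-color: transparent;
      color: inherit;
    }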

@ppf2 We're hitting this issue quite a bit, can you elaborate on the workaround? Should we rollback to K3?

Same issue here

One way could be to allow the user to disable highlighting completely. That would be my preferred approach.
Is that possible?

lwf commented

Hitting this as well with large entries from logstash. Would be nice to be able to disable highlighting.

Hitting this as well. What is the workaround? @ppf2 @rashidkpc. Thanks!

Workaround:
In the big src/public/index.js, search for highlightTags.pre. You will find this block of code (mine is at line 149835):

    $scope.updateDataSource = Promise.method(function () {
      $scope.searchSource
      .size($scope.opts.sampleSize)
      .sort(getSort($state.sort, $scope.indexPattern))
      .query(!$state.query ? null : $state.query)
      .highlight({
        pre_tags: [highlightTags.pre],
        post_tags: [highlightTags.post],
        fields: {'*': {}}
      })
      .set('filter', $state.filters || []);
    });

Transform into:

    $scope.updateDataSource = Promise.method(function () {
      $scope.searchSource
      .size($scope.opts.sampleSize)
      .sort(getSort($state.sort, $scope.indexPattern))
      .query(!$state.query ? null : $state.query)
      .highlight({
        //pre_tags: [highlightTags.pre],
        //post_tags: [highlightTags.post],
        //fields: {'*': {}}
      })
      .set('filter', $state.filters || []);
    });

I guess this way Kibana will not ask Elasticsearch to highlight the fields. It's working for me.
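
A middle-ground variant you could try instead (my own sketch, not something @ptbrowne posted): keep highlighting but swap the '*' wildcard for an explicit list of analyzed fields, so Elasticsearch never tries to highlight the not_analyzed *.raw multi-fields that carry the over-long terms. The field name message below is a placeholder for whatever fields you actually want highlighted:

    $scope.updateDataSource = Promise.method(function () {
      $scope.searchSource
      .size($scope.opts.sampleSize)
      .sort(getSort($state.sort, $scope.indexPattern))
      .query(!$state.query ? null : $state.query)
      .highlight({
        pre_tags: [highlightTags.pre],
        post_tags: [highlightTags.post],
        // only highlight analyzed fields; skip the *.raw multi-fields
        fields: {'message': {}}
      })
      .set('filter', $state.filters || []);
    });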

Just not highlighting seems like a pretty terrible solution. It would be great if Elasticsearch would just give up highlighting quietly in this case, rather than throwing an exception. We have some long stack-trace messages in our logs which make the whole index unsearchable due to this error.

I noticed this in the index definition:

          "message" : {
            "type" : "string",
            "norms" : {
              "enabled" : false
            },
            "fields" : {
              "raw" : {
                "type" : "string",
                "index" : "not_analyzed",
                "ignore_above" : 256
              }
            }
          },

The error in the logs is:

org.elasticsearch.search.fetch.FetchPhaseExecutionException: [logstash-2015.03.18][0]: query[filtered((_all:"exception"))->BooleanFilter(+cache(@timestamp:[1426611659898 TO 1426698059898]))],from[0],size[500],sort[<custom:"@timestamp": org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@417f0a12>!]: Fetch Failed [Failed to highlight field [message.raw]]

It seems like the ignore_above: 256 setting on message.raw should prevent that field from being analyzed when it's over 256 chars, no?

+1, ran into this issue today.

@nornagon: The issue is to do with highlighting rather than analysing

I pushed bobrik/kibana4:4.0.1-no-highlighting on docker hub where highlighting is disabled.

https://github.com/bobrik/docker-kibana4/blob/master/no-highlight.patch

👍

I just ran into the same highlighting problem. For me the workaround provided by @ptbrowne doesn't work. The only solution: filter out the fields that are too large. This is really a big problem, I totally agree with @rmoff.

Same problem here.

Having the same issue. I can confirm the workaround given by @ptbrowne is working on Kibana 4.0.1.

Having the same issue with 4.0.2. The workaround from @ptbrowne is working on 4.0.2 as well.

Logging a Java StackOverflowError will reproduce this consistently.

+1 should be fixed.

+1 same problem here, the workaround from @ptbrowne works for testing

Thanks for the workaround @ptbrowne, I'm using that now as well until this is fixed

Currently marked to be fixed in elasticsearch 1.6.0: elastic/elasticsearch#9881

I'm not sure we can do much about the too-big terms: this is only a problem with old indices, since these days such terms are rejected at index time. What we can do, however, is improve the automatic selection of fields used for highlighting (elastic/elasticsearch#9881). As a side effect this should also avoid triggering the too-big terms, because those are typically present in not_analyzed fields, which are not appropriate for highlighting anyway. We hope to get this in for 1.6 (/cc @brwe)

brwe commented

It is actually not just a problem with old indices; one can run into this with the following mapping:

        "type": "string",
        "analyzer": "keyword",
        "ignore_above": 1

The plain highlighter uses the source for highlighting, and the source can still contain terms bigger than 32766 bytes even though the term was not indexed.
I opened elastic/elasticsearch#11364 to filter out fields that cannot be highlighted by whichever highlighter is configured, and also to ignore the exception thrown when a too-big term is in a document.
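
To make that concrete, here is a minimal sketch of a reproduction (index, type, and field names are made up, and exact behavior varies by ES 1.x version and query type; phrase queries in particular force the plain highlighter to build an in-memory index of the field's tokens, which rejects any token over 32766 bytes):

    PUT /hl-test
    {
      "mappings": {
        "event": {
          "properties": {
            "body": {
              "type": "string",
              "analyzer": "keyword",
              "ignore_above": 1
            }
          }
        }
      }
    }

    PUT /hl-test/event/1
    { "body": "<any value longer than 32766 bytes>" }

    POST /hl-test/_search
    {
      "query": { "match_phrase": { "_all": "longer than" } },
      "highlight": { "fields": { "*": {} } }
    }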

Patch for 4.0.3:

--- kibana-4.0.1-linux-x64/src/public/index.js
+++ kibana-4.0.1-linux-x64_huh/src/public/index.js
@@ -122937,10 +122937,10 @@
       .sort(getSort($state.sort, $scope.indexPattern))
       .query(!$state.query ? null : $state.query)
       .highlight({
-        pre_tags: [highlightTags.pre],
-        post_tags: [highlightTags.post],
-        fields: {'*': {}},
-        fragment_size: 2147483647 // Limit of an integer.
+//        pre_tags: [highlightTags.pre],
+//        post_tags: [highlightTags.post],
+//        fields: {'*': {}},
+//        fragment_size: 2147483647 // Limit of an integer.
       })
       .set('filter', $state.filters || []);
     });
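
To apply it by hand, standard patch usage should work; this assumes you run it from inside the extracted Kibana directory (adjust -p1 if your paths differ from the diff header):

    cd kibana-4.0.3-linux-x64
    patch -p1 < no-highlight.patch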

Docker image bobrik/kibana:4.0.3-no-highlighting is there for you as well.

davux commented

Thanks @rashidkpc.

elastic/elasticsearch#9881 is now closed but this issue is still open. It looks like it's been fixed in 1.6.1 though. What's the status?

We are using 4.0.2 and used the patch that @bobrik posted just above. It is still throwing errors saying the shards are failing. Will that patch only work with 4.0.3? Below is the error we are receiving; any help would be appreciated!

Index: logstash-2015.10.07 Shard: 1 Reason: FetchPhaseExecutionException[[logstash-2015.10.07][1]: query[filtered((environment:test))->BooleanFilter(+cache(@timestamp:[1286557985677 TO 1444324385677]))],from[0],size[500],sort[<custom:"@timestamp": org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@17a826d5>!]: Fetch Failed [Failed to highlight field [message.raw]]]; nested: RuntimeException[org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 317191]; nested: MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 317191];

Looks like this is fixed in Elasticsearch. Closing.

T3rm1 commented

How is this fixed? I still get a warning in Kibana 6.2.2 and the search result is empty.

The length of text to be analyzed for highlighting [17592] exceeded the allowed maximum of [10000] set for the next major Elastic version. For large texts, indexing with offsets or term vectors is recommended!
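
For anyone hitting that 6.x warning: it comes from the limit on how much text Elasticsearch will analyze on the fly for highlighting. Two things worth trying, both hedged (check the docs for your exact version): raise the dynamic index setting index.highlight.max_analyzed_offset, or remap large text fields to store offsets so highlighting no longer needs to re-analyze the text, as the warning itself suggests. Index and field names below are placeholders:

    PUT /my-index/_settings
    {
      "index.highlight.max_analyzed_offset": 100000
    }

    PUT /my-index-v2
    {
      "mappings": {
        "doc": {
          "properties": {
            "message": {
              "type": "text",
              "index_options": "offsets"
            }
          }
        }
      }
    }

Newer Kibana versions also expose a doc_table:highlight advanced setting (Management → Advanced Settings) that turns Discover highlighting off entirely.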