Ability to disable fields from default highlighting
ppf2 opened this issue · 30 comments
The Discover screen (saved search) fails to load and throws a "Failed to highlight field" exception when a field value is too large.
Segmented Fetch: SearchPhaseExecutionException[Failed to execute phase [query_fetch],
all shards failed; shardFailures {[f9Fe0c9WQ7mhuah00pw8Vw][index_name][0]:
RemoteTransportException[[HOSTNAME][inet[/IP:PORT]][indices:data/read/search[phase/query+fetch]]]; nested:
FetchPhaseExecutionException[[index_name][0]: query[filtered(ConstantScore(cache(_type:ticket)))
->BooleanFilter(+cache(created_at:[1296197280253 TO
1422427680253]))],from[0],size[500],sort[<custom:"created_at":
org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@5a14e4bc>!]: Fetch
Failed [Failed to highlight field [comments.raw]]]; nested:
RuntimeException[org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes
can be at most 32766 in length; got 39162]; nested: MaxBytesLengthExceededException[bytes can be
at most 32766 in length; got 39162]; }]
at respond (https://HOSTNAME/index.js?_b=4673:78854:15)
at checkRespForFailure (https://HOSTNAME/index.js?_b=4673:78822:7)
at https://HOSTNAME/index.js?_b=4673:77509:7
at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
at https://HOSTNAME/index.js?_b=4673:20906:76
at Scope.$eval (https://HOSTNAME/index.js?_b=4673:21893:28)
at Scope.$digest (https://HOSTNAME/index.js?_b=4673:21705:31)
at Scope.$apply (https://HOSTNAME/index.js?_b=4673:21997:24)
It would be nice to provide an option to exclude specific field(s) from the default highlighting.
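For context on the mechanism: Discover sends a wildcard highlight block with every search, so Elasticsearch tries to highlight every field of every hit. The relevant part of the request body looks roughly like this (the tag strings are illustrative; the actual values come from Kibana's highlightTags, which show up in the workaround further down the thread):
"highlight": {
  "pre_tags": ["@kibana-highlighted-field@"],
  "post_tags": ["@/kibana-highlighted-field@"],
  "fields": { "*": {} }
}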
I see a few problems from a usability perspective:
- We can't detect up front whether a field will be too large, so the failure has to happen first.
- The user is unlikely to know that the field is too large. As far as I know, we don't expose the maximum field length anywhere that's easily accessible.
- The error message isn't parsable, so we can't automatically disable highlighting or tell the user what went wrong.
I'd propose fixing this on the Elasticsearch side and creating a lenient highlighting mode that won't throw an exception when highlighting fails. Preferably with some indication that we skipped highlighting and why.
As well as the proposed change to ES, how about also an advanced config option to disable the highlighting entirely from the Kibana side? Otherwise this kills Kibana 4 entirely - there's no way to use it with the same dataset that works just fine in Kibana 3.
thanks.
Until elastic/elasticsearch#5836 is available in ES, I think we can use a workaround in K4 so that the Discover screen will work for these datasets.
+1 on this. Just for aesthetics, I would like to disable highlighting for searches I've placed in my dashboard.
edit: you can turn this off in the main.css of kibana
@ppf2 We're hitting this issue quite a bit, can you elaborate on the workaround? Should we rollback to K3?
Same issue here
One way could be to allow the user to disable highlighting completely. That would be my preferred way.
Is that possible?
Hitting this as well with large entries from logstash. Would be nice to be able to disable highlighting.
Hitting this as well. What is the workaround? @ppf2 @rashidkpc. Thanks!
Workaround:
In the big src/public/index.js, search for highlightTags.pre. You will find this block of code (mine is at line 149835):
$scope.updateDataSource = Promise.method(function () {
$scope.searchSource
.size($scope.opts.sampleSize)
.sort(getSort($state.sort, $scope.indexPattern))
.query(!$state.query ? null : $state.query)
.highlight({
pre_tags: [highlightTags.pre],
post_tags: [highlightTags.post],
fields: {'*': {}}
})
.set('filter', $state.filters || []);
});
Transform into:
$scope.updateDataSource = Promise.method(function () {
$scope.searchSource
.size($scope.opts.sampleSize)
.sort(getSort($state.sort, $scope.indexPattern))
.query(!$state.query ? null : $state.query)
.highlight({
//pre_tags: [highlightTags.pre],
//post_tags: [highlightTags.post],
//fields: {'*': {}}
})
.set('filter', $state.filters || []);
});
I guess it will not ask ElasticSearch to highlight the fields. It's working for me.
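A sketch of an equivalent edit, for anyone who prefers not to leave an empty object behind: drop the whole .highlight() call instead of commenting out its contents. Assuming nothing else in Discover relies on a highlight config being set, the effect should be the same - no highlight block in the request:
$scope.updateDataSource = Promise.method(function () {
  $scope.searchSource
    .size($scope.opts.sampleSize)
    .sort(getSort($state.sort, $scope.indexPattern))
    .query(!$state.query ? null : $state.query)
    // .highlight(...) removed entirely, so no highlight block is sent to Elasticsearch
    .set('filter', $state.filters || []);
});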
Just not highlighting seems like a pretty terrible solution. It would be great if ElasticSearch would just give up highlighting quietly in this case, rather than throwing an exception. We have some long stack trace messages in our logs which make the whole index unsearchable due to this error.
I noticed this in the index definition:
"message" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
},
The error in the logs is:
org.elasticsearch.search.fetch.FetchPhaseExecutionException: [logstash-2015.03.18][0]: query[filtered((_all:"exception"))->BooleanFilter(+cache(@timestamp:[1426611659898 TO 1426698059898]))],from[0],size[500],sort[<custom:"@timestamp": org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@417f0a12>!]: Fetch Failed [Failed to highlight field [message.raw]]
It seems like the ignore_above: 256 setting on message.raw should prevent that field from being analyzed when it's over 256 chars, non?
+1, ran into this issue today.
@nornagon: The issue is to do with highlighting rather than analysing
I pushed bobrik/kibana4:4.0.1-no-highlighting to Docker Hub, with highlighting disabled.
https://github.com/bobrik/docker-kibana4/blob/master/no-highlight.patch
👍
Same problem here.
Having the same issue. I can confirm the workaround given by @ptbrowne is working on Kibana 4.0.1.
Having same issue with 4.0.2. Workaround from @ptbrowne is working on 4.0.2 as well.
Logging a java StackOverflowError will reproduce this consistently.
+1 should be fixed.
+1 same problem here, the workaround from @ptbrowne works for testing
Thanks for the workaround @ptbrowne, I'm using that now as well until this is fixed
Currently marked to be fixed in elasticsearch 1.6.0: elastic/elasticsearch#9881
I'm not sure we can do much about the too-big terms - this is only a problem with old indices, as these terms are not rejected at index time. What we can do, however, is to improve the automatic selection of fields used for highlighting (elastic/elasticsearch#9881). As a side effect this should also avoid triggering the too-big terms, because those are typically present in not_analyzed fields, which are not appropriate for highlighting. We hope to get this in for 1.6 (/cc @brwe)
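On the Kibana side, a hypothetical version of that idea would be to build the highlight field list from the index pattern instead of using the '*' wildcard. This is only a sketch; it assumes the index-pattern field objects expose name, type and an analyzed flag, which may not match the real Kibana 4 internals:
var highlightFields = {};
$scope.indexPattern.fields.forEach(function (field) {
  // Only analyzed string fields are reasonable highlight targets; not_analyzed
  // fields such as *.raw are the ones that trip the 32766-byte term limit.
  if (field.type === 'string' && field.analyzed) {
    highlightFields[field.name] = {};
  }
});
// ...then pass `fields: highlightFields` to searchSource.highlight() instead of {'*': {}}.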
It is actually not only a problem with old indices; one can run into this with the following mapping:
"type": "string",
"analyzer": "keyword",
"ignore_above": 1
The plain highlighter uses the _source to highlight, and that will still contain terms bigger than 32766 bytes even though the term was not indexed.
I opened elastic/elasticsearch#11364 to filter out fields that cannot be highlighted by whatever highlighter is defined and also to ignore the exception thrown in case a too big term is in a document.
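To make that failure mode concrete, here is a reproduction sketch along those lines. Index and field names are made up, the value size is arbitrary, and it assumes an Elasticsearch 1.x node on localhost:9200 plus Node 18+ for the built-in fetch; treat it as a sketch rather than a guaranteed repro.
// ignore_above keeps the oversized value out of the inverted index, but _source
// still holds it, and the plain highlighter re-analyzes _source.
const ES = 'http://localhost:9200';
const es = (method, path, body) =>
  fetch(ES + path, {
    method,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  }).then((r) => r.json());

const big = 'lorem ipsum dolor sit amet '.repeat(2000); // ~54 KB, well over 32766 bytes

async function main() {
  // Logstash-style multi-field: analyzed "message" plus not_analyzed "message.raw".
  await es('PUT', '/hl-repro', {
    mappings: {
      doc: {
        properties: {
          message: {
            type: 'string',
            fields: {
              raw: { type: 'string', index: 'not_analyzed', ignore_above: 256 },
            },
          },
        },
      },
    },
  });

  // Indexing succeeds: "message" is tokenized into small terms, and "message.raw"
  // is skipped entirely because the value is longer than ignore_above.
  await es('PUT', '/hl-repro/doc/1?refresh=true', { message: big });

  // A phrase query plus wildcard highlighting, similar to what Discover sends;
  // highlighting "message.raw" from _source should then trip the 32766-byte limit.
  const result = await es('POST', '/hl-repro/_search', {
    query: { query_string: { query: '"lorem ipsum"' } },
    highlight: { fields: { '*': {} } },
  });
  console.log(JSON.stringify(result, null, 2));
}

main().catch(console.error);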
Patch for 4.0.3:
--- kibana-4.0.1-linux-x64/src/public/index.js
+++ kibana-4.0.1-linux-x64_huh/src/public/index.js
@@ -122937,10 +122937,10 @@
.sort(getSort($state.sort, $scope.indexPattern))
.query(!$state.query ? null : $state.query)
.highlight({
- pre_tags: [highlightTags.pre],
- post_tags: [highlightTags.post],
- fields: {'*': {}},
- fragment_size: 2147483647 // Limit of an integer.
+// pre_tags: [highlightTags.pre],
+// post_tags: [highlightTags.post],
+// fields: {'*': {}},
+// fragment_size: 2147483647 // Limit of an integer.
})
.set('filter', $state.filters || []);
});
Docker image bobrik/kibana:4.0.3-no-highlighting is there for you as well.
Thanks @rashidkpc.
elastic/elasticsearch#9881 is now closed but this issue is still open. It looks like it's been fixed in 1.6.1 though. What's the status?
We are using 4.0.2 and applied the patch that bobrik posted just above. It is still throwing errors saying the shards are failing. Will that patch only work with 4.0.3? Below is the error we are receiving; any help would be appreciated!
Index: logstash-2015.10.07 Shard: 1 Reason: FetchPhaseExecutionException[[logstash-2015.10.07][1]: query[filtered((environment:test))->BooleanFilter(+cache(@timestamp:[1286557985677 TO 1444324385677]))],from[0],size[500],sort[<custom:"@timestamp": org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@17a826d5>!]: Fetch Failed [Failed to highlight field [message.raw]]]; nested: RuntimeException[org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 317191]; nested: MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 317191];
Looks like this is fixed in elasticsearch. Closing
How is this fixed? I still get a warning in Kibana 6.2.2 and the search result is empty.
The length of text to be analyzed for highlighting [17592] exceeded the allowed maximum of [10000] set for the next major Elastic version. For large texts, indexing with offsets or term vectors is recommended!
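For anyone hitting that 6.x warning: the message's suggestion corresponds to mapping the large text field with offsets (or term vectors) so the highlighter doesn't have to re-analyze the whole value. A sketch of the mapping change (the field name is illustrative, not taken from this thread):
"message": {
  "type": "text",
  "index_options": "offsets"
}
Alternatively, "term_vector": "with_positions_offsets" on the same field achieves a similar effect via the fast vector highlighter, at the cost of a larger index.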