Ability to disable fields from default highlighting
ppf2 opened this issue · 30 comments
The Discover screen (saved search) fails to load and throws a "Failed to highlight field" exception when a field value is too large.
Segmented Fetch: SearchPhaseExecutionException[Failed to execute phase [query_fetch],
all shards failed; shardFailures {[f9Fe0c9WQ7mhuah00pw8Vw][index_name][0]:
RemoteTransportException[[HOSTNAME][inet[/IP:PORT]][indices:data/read/search[phase/query+fetch]]]; nested:
FetchPhaseExecutionException[[index_name][0]: query[filtered(ConstantScore(cache(_type:ticket)))
->BooleanFilter(+cache(created_at:[1296197280253 TO
1422427680253]))],from[0],size[500],sort[<custom:"created_at":
org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@5a14e4bc>!]: Fetch
Failed [Failed to highlight field [comments.raw]]]; nested:
RuntimeException[org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes
can be at most 32766 in length; got 39162]; nested: MaxBytesLengthExceededException[bytes can be
at most 32766 in length; got 39162]; }]
at respond (https://HOSTNAME/index.js?_b=4673:78854:15)
at checkRespForFailure (https://HOSTNAME/index.js?_b=4673:78822:7)
at https://HOSTNAME/index.js?_b=4673:77509:7
at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
at wrappedErrback (https://HOSTNAME/index.js?_b=4673:20773:78)
at https://HOSTNAME/index.js?_b=4673:20906:76
at Scope.$eval (https://HOSTNAME/index.js?_b=4673:21893:28)
at Scope.$digest (https://HOSTNAME/index.js?_b=4673:21705:31)
at Scope.$apply (https://HOSTNAME/index.js?_b=4673:21997:24)
It would be nice to provide an option to exclude specific field(s) from the default highlighting.
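For context on the mechanism: Discover sends a wildcard highlight block with every search, so Elasticsearch tries to highlight every field of every hit. The relevant part of the request body looks roughly like this (the tag strings are illustrative; the actual values come from Kibana's highlightTags, which show up in the workaround further down the thread):
"highlight": {
  "pre_tags": ["@kibana-highlighted-field@"],
  "post_tags": ["@/kibana-highlighted-field@"],
  "fields": { "*": {} }
}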
I see a few problems from a usability perspective:
- We can't detect up front whether a field will be too large, so the failure has to happen first.
- The user is unlikely to know that the field is too large. As far as I know, we don't expose the maximum field length anywhere that's easily accessible.
- The error message isn't parsable, so we can't automatically disable highlighting or tell the user what went wrong.
I'd propose fixing this on the Elasticsearch side and creating a lenient highlighting mode that won't throw an exception when highlighting fails. Preferably with some indication that we skipped highlighting and why.
As well as the proposed change to ES, how about also an advanced config option to disable the highlighting entirely from the Kibana side? Otherwise this kills Kibana 4 entirely - there's no way to use it with the same dataset that works just fine in Kibana 3.
thanks.
Until elastic/elasticsearch#5836 is available in ES, I think we can use a workaround in K4 so that the Discover screen will work for these datasets.
+1 on this. Just for aesthetics, I would like to disable highlighting for searches I've placed in my dashboard.
edit: you can turn this off in the main.css of kibana
@ppf2 We're hitting this issue quite a bit, can you elaborate on the workaround? Should we rollback to K3?
Same issue here
One way could be to allow the user to disable highlighting completely. That would be my preferred way.
Is that possible?
Hitting this as well with large entries from logstash. Would be nice to be able to disable highlighting.
Hitting this as well. What is the workaround? @ppf2 @rashidkpc. Thanks!
Workaround:
In the big src/public/index.js, search for highlightTags.pre. You will find this block of code (mine is at line 149835):
$scope.updateDataSource = Promise.method(function () {
$scope.searchSource
.size($scope.opts.sampleSize)
.sort(getSort($state.sort, $scope.indexPattern))
.query(!$state.query ? null : $state.query)
.highlight({
pre_tags: [highlightTags.pre],
post_tags: [highlightTags.post],
fields: {'*': {}}
})
.set('filter', $state.filters || []);
});
Transform into:
$scope.updateDataSource = Promise.method(function () {
$scope.searchSource
.size($scope.opts.sampleSize)
.sort(getSort($state.sort, $scope.indexPattern))
.query(!$state.query ? null : $state.query)
.highlight({
//pre_tags: [highlightTags.pre],
//post_tags: [highlightTags.post],
//fields: {'*': {}}
})
.set('filter', $state.filters || []);
});
I guess it will not ask ElasticSearch to highlight the fields. It's working for me.
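A sketch of an equivalent edit, for anyone who prefers not to leave an empty object behind: drop the whole .highlight() call instead of commenting out its contents. Assuming nothing else in Discover relies on a highlight config being set, the effect should be the same - no highlight block in the request:
$scope.updateDataSource = Promise.method(function () {
  $scope.searchSource
    .size($scope.opts.sampleSize)
    .sort(getSort($state.sort, $scope.indexPattern))
    .query(!$state.query ? null : $state.query)
    // .highlight(...) removed entirely, so no highlight block is sent to Elasticsearch
    .set('filter', $state.filters || []);
});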
Just not highlighting seems like a pretty terrible solution. It would be great if ElasticSearch would just give up highlighting quietly in this case, rather than throwing an exception. We have some long stack trace messages in our logs which make the whole index unsearchable due to this error.
I noticed this in the index definition:
"message" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
},
The error in the logs is:
org.elasticsearch.search.fetch.FetchPhaseExecutionException: [logstash-2015.03.18][0]: query[filtered((_all:"exception"))->BooleanFilter(+cache(@timestamp:[1426611659898 TO 1426698059898]))],from[0],size[500],sort[<custom:"@timestamp": org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@417f0a12>!]: Fetch Failed [Failed to highlight field [message.raw]]
It seems like the ignore_above: 256 setting on message.raw should prevent that field from being analyzed when it's over 256 chars, non?
+1, ran into this issue today.
@nornagon: The issue is to do with highlighting rather than analysing
I pushed bobrik/kibana4:4.0.1-no-highlighting to Docker Hub, with highlighting disabled.
https://github.com/bobrik/docker-kibana4/blob/master/no-highlight.patch
👍
Same problem here.
Having the same issue. I can confirm the workaround given by @ptbrowne is working on Kibana 4.0.1.
Having same issue with 4.0.2. Workaround from @ptbrowne is working on 4.0.2 as well.
Logging a java StackOverflowError will reproduce this consistently.
+1 should be fixed.
+1 same problem here, the workaround from @ptbrowne works for testing
Thanks for the workaround @ptbrowne, I'm using that now as well until this is fixed
Currently marked to be fixed in elasticsearch 1.6.0: elastic/elasticsearch#9881
I'm not sure we can do much about the too-big terms - this is only a problem with old indices, as these terms are not rejected at index time. What we can do, however, is to improve the automatic selection of fields used for highlighting (elastic/elasticsearch#9881). As a side effect this should also avoid triggering the too-big terms, because those are typically present in not_analyzed fields, which are not appropriate for highlighting. We hope to get this in for 1.6 (/cc @brwe)
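On the Kibana side, a hypothetical version of that idea would be to build the highlight field list from the index pattern instead of using the '*' wildcard. This is only a sketch; it assumes the index-pattern field objects expose name, type and an analyzed flag, which may not match the real Kibana 4 internals:
var highlightFields = {};
$scope.indexPattern.fields.forEach(function (field) {
  // Only analyzed string fields are reasonable highlight targets; not_analyzed
  // fields such as *.raw are the ones that trip the 32766-byte term limit.
  if (field.type === 'string' && field.analyzed) {
    highlightFields[field.name] = {};
  }
});
// ...then pass `fields: highlightFields` to searchSource.highlight() instead of {'*': {}}.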
It is actually not only a problem with old indices; one can run into this with the following mapping:
"type": "string",
"analyzer": "keyword",
"ignore_above": 1
The plain highlighter uses the _source to highlight, and that will still contain terms bigger than 32766 bytes even though the term was not indexed.
I opened elastic/elasticsearch#11364 to filter out fields that cannot be highlighted by whatever highlighter is defined and also to ignore the exception thrown in case a too big term is in a document.
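To make that failure mode concrete, here is a reproduction sketch along those lines. Index and field names are made up, the value size is arbitrary, and it assumes an Elasticsearch 1.x node on localhost:9200 plus Node 18+ for the built-in fetch; treat it as a sketch rather than a guaranteed repro.
// ignore_above keeps the oversized value out of the inverted index, but _source
// still holds it, and the plain highlighter re-analyzes _source.
const ES = 'http://localhost:9200';
const es = (method, path, body) =>
  fetch(ES + path, {
    method,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  }).then((r) => r.json());

const big = 'lorem ipsum dolor sit amet '.repeat(2000); // ~54 KB, well over 32766 bytes

async function main() {
  // Logstash-style multi-field: analyzed "message" plus not_analyzed "message.raw".
  await es('PUT', '/hl-repro', {
    mappings: {
      doc: {
        properties: {
          message: {
            type: 'string',
            fields: {
              raw: { type: 'string', index: 'not_analyzed', ignore_above: 256 },
            },
          },
        },
      },
    },
  });

  // Indexing succeeds: "message" is tokenized into small terms, and "message.raw"
  // is skipped entirely because the value is longer than ignore_above.
  await es('PUT', '/hl-repro/doc/1?refresh=true', { message: big });

  // A phrase query plus wildcard highlighting, similar to what Discover sends;
  // highlighting "message.raw" from _source should then trip the 32766-byte limit.
  const result = await es('POST', '/hl-repro/_search', {
    query: { query_string: { query: '"lorem ipsum"' } },
    highlight: { fields: { '*': {} } },
  });
  console.log(JSON.stringify(result, null, 2));
}

main().catch(console.error);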
Patch for 4.0.3:
--- kibana-4.0.1-linux-x64/src/public/index.js
+++ kibana-4.0.1-linux-x64_huh/src/public/index.js
@@ -122937,10 +122937,10 @@
.sort(getSort($state.sort, $scope.indexPattern))
.query(!$state.query ? null : $state.query)
.highlight({
- pre_tags: [highlightTags.pre],
- post_tags: [highlightTags.post],
- fields: {'*': {}},
- fragment_size: 2147483647 // Limit of an integer.
+// pre_tags: [highlightTags.pre],
+// post_tags: [highlightTags.post],
+// fields: {'*': {}},
+// fragment_size: 2147483647 // Limit of an integer.
})
.set('filter', $state.filters || []);
});
Docker image bobrik/kibana:4.0.3-no-highlighting is there for you as well.
Thanks @rashidkpc.
elastic/elasticsearch#9881 is now closed but this issue is still open. It looks like it's been fixed in 1.6.1 though. What's the status?
We are using 4.0.2 and applied the patch that bobrik posted just above. It is still throwing errors saying the shards are failing. Will that patch only work with 4.0.3? Below is the error we are receiving; any help would be appreciated!
Index: logstash-2015.10.07 Shard: 1 Reason: FetchPhaseExecutionException[[logstash-2015.10.07][1]: query[filtered((environment:test))->BooleanFilter(+cache(@timestamp:[1286557985677 TO 1444324385677]))],from[0],size[500],sort[<custom:"@timestamp": org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@17a826d5>!]: Fetch Failed [Failed to highlight field [message.raw]]]; nested: RuntimeException[org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 317191]; nested: MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 317191];
Looks like this is fixed in elasticsearch. Closing
How is this fixed? I still get a warning in Kibana 6.2.2 and the search result is empty.
The length of text to be analyzed for highlighting [17592] exceeded the allowed maximum of [10000] set for the next major Elastic version. For large texts, indexing with offsets or term vectors is recommended!
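For anyone hitting that 6.x warning: the message's suggestion corresponds to mapping the large text field with offsets (or term vectors) so the highlighter doesn't have to re-analyze the whole value. A sketch of the mapping change (the field name is illustrative, not taken from this thread):
"message": {
  "type": "text",
  "index_options": "offsets"
}
Alternatively, "term_vector": "with_positions_offsets" on the same field achieves a similar effect via the fast vector highlighter, at the cost of a larger index.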