dbmdz/solr-ocrhighlighting

IndexOutOfBoundsException

clorenz opened this issue · 1 comment

The highlighter runs into an IndexOutOfBoundsException, probably only in a very special case. Unfortunately, the error provides no further information, such as the document ID or the minocr file name:

null:java.lang.IndexOutOfBoundsException: 6131276
	at java.base/java.nio.DirectByteBuffer.get(DirectByteBuffer.java:270)
	at org.mdz.search.solrocr.util.FileBytesCharIterator.adjustOffset(FileBytesCharIterator.java:81)
	at org.mdz.search.solrocr.util.FileBytesCharIterator.subSequence(FileBytesCharIterator.java:136)
	at org.mdz.search.solrocr.formats.OcrPassageFormatter.format(OcrPassageFormatter.java:41)
	at org.mdz.search.solrocr.lucene.OcrFieldHighlighter.highlightFieldForDoc(OcrFieldHighlighter.java:74)
	at org.mdz.search.solrocr.lucene.OcrHighlighter.highlightOcrFields(OcrHighlighter.java:195)
	at org.mdz.search.solrocr.solr.SolrOcrHighlighter.doHighlighting(SolrOcrHighlighter.java:82)
	at org.mdz.search.solrocr.solr.HighlightComponent.process(HighlightComponent.java:95)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)

Closing this for now, since the issue hasn't reappeared after the latest updates. We now also provide more context when an out-of-bounds (OOB) read occurs, so if the issue reappears we will have more information to work with.
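
For readers wondering what "more context" on an OOB read could look like, here is a minimal sketch. It is not the plugin's actual code: the class `OobContextSketch`, the method `readByte`, and the `docId` and `sourceName` parameters are all hypothetical names chosen for illustration. The idea is simply to check the offset before the buffer read and raise an exception whose message names the document and file involved.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not taken from solr-ocrhighlighting.
public class OobContextSketch {

  // Reads one byte at an absolute offset. If the offset is outside the
  // buffer, throws an exception that carries the document ID and source
  // file name (both assumed to be known to the caller).
  static byte readByte(ByteBuffer buf, int offset, String docId, String sourceName) {
    if (offset < 0 || offset >= buf.limit()) {
      throw new IndexOutOfBoundsException(String.format(
          "Offset %d is outside of buffer with limit %d (document '%s', file '%s')",
          offset, buf.limit(), docId, sourceName));
    }
    return buf.get(offset);
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocateDirect(16);
    try {
      // Uses the offset from the stack trace above as an example value.
      readByte(buf, 6131276, "example-doc", "example.xml");
    } catch (IndexOutOfBoundsException e) {
      System.err.println(e.getMessage());
    }
  }
}
```

With a message like this, a report of the exception would already identify which document and which OCR file triggered the bad offset.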