patrickfrey/strus

big token positions

andreasbaumann opened this issue · 1 comment

2016-08-09 10:52:22; strusWebService, error: Token positions of document 693-2009 are out or range (document too big, only 76263 token positions were assigned, maximum allowed position is %65535) (master.cpp:96)

One idea is to support small, big, and very big positions in the index. Simply dropping the
positions is not a good solution. The document is a big PDF, but splitting it causes a clustering
problem and a "too small retrieval item" problem.

The problem is caused by a limit in the blocks that store positions in the storage.
I agree that this must be fixed.
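
To make the "small, big, very big" idea concrete, here is a minimal sketch of a position block that picks its width from the largest position it has to store. It assumes the 65535 limit in the log comes from a 16-bit position field, which is only inferred from the number itself. The names `PositionWidth`, `widthFor`, `encodeBlock` and `decodeBlock` are hypothetical and are not part of the strus storage code or its actual block layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical width classes: bytes used per stored position.
enum class PositionWidth : std::size_t { Small = 2, Big = 3, VeryBig = 4 };

// Choose the narrowest width that can hold the largest position of a block.
inline PositionWidth widthFor(std::uint32_t maxPosition) {
    if (maxPosition <= 0xFFFFu) return PositionWidth::Small;    // up to 65535
    if (maxPosition <= 0xFFFFFFu) return PositionWidth::Big;    // up to 16777215
    return PositionWidth::VeryBig;                              // full 32 bits
}

// Encode a block as: one header byte with the width, then every position
// in little-endian order using exactly that many bytes.
inline std::vector<std::uint8_t> encodeBlock(const std::vector<std::uint32_t>& positions) {
    std::uint32_t maxPosition = 0;
    for (std::uint32_t p : positions) {
        if (p > maxPosition) maxPosition = p;
    }
    const std::size_t width = static_cast<std::size_t>(widthFor(maxPosition));

    std::vector<std::uint8_t> out;
    out.reserve(1 + positions.size() * width);
    out.push_back(static_cast<std::uint8_t>(width));
    for (std::uint32_t p : positions) {
        for (std::size_t i = 0; i < width; ++i) {
            out.push_back(static_cast<std::uint8_t>((p >> (8 * i)) & 0xFFu));
        }
    }
    return out;
}

// Decode a block written by encodeBlock.
inline std::vector<std::uint32_t> decodeBlock(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint32_t> positions;
    if (bytes.empty()) return positions;

    const std::size_t width = bytes[0];
    assert(width >= 2 && width <= 4);          // only the three classes above
    assert((bytes.size() - 1) % width == 0);   // payload must be whole positions

    for (std::size_t off = 1; off < bytes.size(); off += width) {
        std::uint32_t p = 0;
        for (std::size_t i = 0; i < width; ++i) {
            p |= static_cast<std::uint32_t>(bytes[off + i]) << (8 * i);
        }
        positions.push_back(p);
    }
    return positions;
}
```

With a scheme like this, a document such as 693-2009 with 76263 token positions would fall into the `Big` class at 3 bytes per position, while blocks of ordinary documents would keep the 2-byte layout. A real fix would also have to deal with compatibility with existing indexes, which this sketch ignores.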