INL/BlackLab

SpansAnd doesn't handle clauses that have multiple hits with same start and end position correctly

Closed · 1 comment

An issue with uniqueness and match info: SpansAnd only takes start and end position into account. After a match is found, both clauses are advanced to their next hit and then synchronized again. This usually works fine, but it fails if one or both clauses have multiple hits with the same start and end position but different match info: the remaining hits at that position are skipped, so not all combinations are produced.
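To make the failure concrete, here is a minimal sketch of the "advance both, then resynchronize" strategy on plain lists. It is not BlackLab code: `Hit`, `NaiveAnd` and the string labels standing in for match info are hypothetical names chosen for illustration, and both inputs are assumed to be sorted by start and then end position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a hit: a span plus a label representing its match info. */
final class Hit {
    final int start;
    final int end;
    final String info;

    Hit(int start, int end, String info) {
        this.start = start;
        this.end = end;
        this.info = info;
    }

    /** Orders hits by start position, then by end position; match info is ignored. */
    int comparePosition(Hit other) {
        return start != other.start
                ? Integer.compare(start, other.start)
                : Integer.compare(end, other.end);
    }
}

/** Simplified model of an AND that only looks at start and end position. */
final class NaiveAnd {
    static List<String> match(List<Hit> a, List<Hit> b) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).comparePosition(b.get(j));
            if (cmp < 0) {
                i++; // clause a is behind: catch up
            } else if (cmp > 0) {
                j++; // clause b is behind: catch up
            } else {
                out.add(a.get(i).info + "+" + b.get(j).info);
                // Advancing BOTH clauses here is what loses combinations
                // when a position occurs more than once in either clause.
                i++;
                j++;
            }
        }
        return out;
    }
}
```

With clause a holding two hits at (3, 5) labelled `a1` and `a2`, and clause b holding one hit at (3, 5) labelled `b1`, this sketch returns only `a1+b1`; the combination `a2+b1` is never produced.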

The correct approach would be:

  • implement SpansInBucketsPerStartEndPoint and wrap both clauses in it
  • find matching buckets
  • produce all possible combinations from these buckets
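The steps above can be sketched on plain lists as follows. This is only an illustration under stated assumptions, not the actual implementation: `InfoHit`, `BucketedAnd` and `bucketEnd` are hypothetical names, match info is reduced to a string label, and a "bucket" is modelled as a run of consecutive hits with the same start and end position in a list sorted by position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a hit: a span plus a label representing its match info. */
final class InfoHit {
    final int start;
    final int end;
    final String info;

    InfoHit(int start, int end, String info) {
        this.start = start;
        this.end = end;
        this.info = info;
    }

    /** Orders hits by start position, then by end position; match info is ignored. */
    int comparePosition(InfoHit other) {
        return start != other.start
                ? Integer.compare(start, other.start)
                : Integer.compare(end, other.end);
    }
}

/** Sketch of an AND that groups hits per start/end position before combining. */
final class BucketedAnd {
    /** Returns the index just past the bucket (same start and end) that begins at 'from'. */
    static int bucketEnd(List<InfoHit> hits, int from) {
        int to = from + 1;
        while (to < hits.size() && hits.get(to).comparePosition(hits.get(from)) == 0) {
            to++;
        }
        return to;
    }

    static List<String> match(List<InfoHit> a, List<InfoHit> b) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).comparePosition(b.get(j));
            if (cmp < 0) {
                i = bucketEnd(a, i); // skip a's whole bucket
            } else if (cmp > 0) {
                j = bucketEnd(b, j); // skip b's whole bucket
            } else {
                // Matching buckets: emit every pairing of their hits.
                int iEnd = bucketEnd(a, i);
                int jEnd = bucketEnd(b, j);
                for (int x = i; x < iEnd; x++) {
                    for (int y = j; y < jEnd; y++) {
                        out.add(a.get(x).info + "+" + b.get(y).info);
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        return out;
    }
}
```

For two clauses that each have hits `1` and `2` at position (3, 5), this sketch yields all four pairings instead of one. The cost is that a bucket of m hits matched against a bucket of n hits produces m × n results, which seems acceptable if such duplicate positions are rare.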

We should probably double-check that the produced matches do indeed have different match info.

Fixed in branch unify-captures-relations.