dice-group/gerbil

Fix overlapping markings in MSNBC dataset


The MSNBC dataset appears to contain overlapping markings.

Example: Bus16451112.txt
from 699, to 710, URI=http://dbpedia.org/resource/Frank_Blake
from 705, to 710, URI=http://dbpedia.org/resource/Frank_Blake

Span merging is already implemented in the org.aksw.gerbil.evaluate.impl.SpanMergingEvaluatorDecorator class; that logic could be extracted and reused.
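
To make the intended behaviour concrete, here is a minimal sketch of merging overlapping markings. It is not the code from SpanMergingEvaluatorDecorator and does not use any GERBIL classes: `SpanMergeSketch`, `Marking`, and `mergeOverlapping` are hypothetical names introduced for illustration. It assumes that "from"/"to" are character offsets with an exclusive end, and that only markings sharing the same URI should be merged. As a simplification, it only compares each marking with the previous one in sorted order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SpanMergeSketch {

    // Hypothetical value type for illustration; not GERBIL's marking class.
    static final class Marking {
        final int start; // "from" offset
        final int end;   // "to" offset, assumed exclusive
        final String uri;

        Marking(int start, int end, String uri) {
            this.start = start;
            this.end = end;
            this.uri = uri;
        }
    }

    // Merges markings that overlap the previously kept marking and share its URI.
    static List<Marking> mergeOverlapping(List<Marking> markings) {
        List<Marking> sorted = new ArrayList<>(markings);
        sorted.sort(Comparator.comparingInt((Marking m) -> m.start)
                .thenComparingInt(m -> m.end));

        List<Marking> merged = new ArrayList<>();
        for (Marking m : sorted) {
            int last = merged.size() - 1;
            if (last >= 0) {
                Marking prev = merged.get(last);
                if (m.start < prev.end && m.uri.equals(prev.uri)) {
                    // Extend the previous marking so it covers both spans.
                    merged.set(last,
                            new Marking(prev.start, Math.max(prev.end, m.end), prev.uri));
                    continue;
                }
            }
            merged.add(m);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Marking> example = new ArrayList<>();
        example.add(new Marking(699, 710, "http://dbpedia.org/resource/Frank_Blake"));
        example.add(new Marking(705, 710, "http://dbpedia.org/resource/Frank_Blake"));

        for (Marking m : mergeOverlapping(example)) {
            System.out.println(m.start + "-" + m.end + " " + m.uri);
        }
        // prints: 699-710 http://dbpedia.org/resource/Frank_Blake
    }
}
```

With the two markings from Bus16451112.txt as input, the sketch keeps a single marking covering 699 to 710. Overlapping markings with different URIs are left untouched, since it is unclear which URI should win in that case.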