Box merging leads to a mismatch between text and image highlights
jbaiter opened this issue · 0 comments
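When a query for individual tokens matches two adjacent tokens, their boxes are merged into a single highlight box, but their text highlights remain separate. In the response below, the snippet text marks "Die" and "Zahl" as two separate <em> spans, while the highlights array contains a single merged box with the text "Die Zahl":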
{
  "text": "<em>Die</em><em>Zahl</em>derer,welchejeneSchreckens: zeitmitAugenſahen,inwelcherZittau, imGefolgedesſiebenjährigenKrieges,den 23.Juli1757,auf<em>die</em>ſchre>li<ſteArt zerſtörtward,kannzwarnurnochklein ſeyn,jedochiſtgewißjedembiedernZit-",
  // ...
  "highlights": [
    [
      {
        "ulx": 142,
        "uly": 720,
        "lrx": 348,
        "lry": 792,
        "text": "Die Zahl",
        "parentRegionIdx": 0
      }
    ],
    [
      {
        "ulx": 585,
        "uly": 892,
        "lrx": 637,
        "lry": 929,
        "text": "die",
        "parentRegionIdx": 0
      }
    ]
  ]
}
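To make the mismatch concrete, here is a minimal Python sketch of one way a consumer could detect it and work around it. This is not the plugin's implementation and not a proposed fix in the plugin itself; the function names `merge_adjacent_highlights` and `highlights_consistent` are hypothetical, and the sample text is simplified (with spaces restored) rather than taken verbatim from the response above. It only assumes the response shape shown: a `text` string with `<em>` markers and a `highlights` list of box groups, each box carrying a `text` field.

```python
import re

# Matches the boundary between two <em> spans separated only by whitespace.
ADJACENT_EM = re.compile(r"</em>(\s*)<em>")
# Captures the content of each <em> span.
EM_SPAN = re.compile(r"<em>(.*?)</em>", re.DOTALL)


def merge_adjacent_highlights(snippet_text: str) -> str:
    """Collapse directly adjacent <em> spans into one, keeping the whitespace."""
    return ADJACENT_EM.sub(r"\1", snippet_text)


def highlights_consistent(snippet_text: str, highlights: list) -> bool:
    """Check that each <em> span in the text corresponds to one highlight group."""
    spans = EM_SPAN.findall(snippet_text)
    box_texts = [" ".join(box["text"] for box in group) for group in highlights]
    return spans == box_texts


# Simplified sample data (hypothetical, modeled on the response above).
text = "<em>Die</em> <em>Zahl</em> derer, auf <em>die</em> Art"
highlights = [[{"text": "Die Zahl"}], [{"text": "die"}]]

before = highlights_consistent(text, highlights)
merged = merge_adjacent_highlights(text)
after = highlights_consistent(merged, highlights)
```

With this sample, the check fails on the original text (three `<em>` spans against two highlight groups) and passes once the adjacent spans are merged. Merging the text spans is only one option; leaving the boxes unmerged would resolve the mismatch equally well.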
Thanks to @ulb-sa-schmilj for reporting!