qurator-spk/dinglehopper

Improve visual alignment for longer documents

mikegerber opened this issue · 1 comment

@stweil asked in #62:

Unrelated: in the result the lines from GT and OCR result are side by side at the beginning, but that synchronization gets lost later. Why?

The honest answer is: the lines align nicely in shorter documents only by accident. The text on the left is just the GT text and the text on the right is just the OCR text, with no line-level alignment between them.

For larger documents, or texts with, say, larger gaps, we would need to make an explicit effort to align the lines.
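
To make the idea concrete, here is a minimal sketch of what such a line alignment could look like. It is not dinglehopper's implementation. It assumes both texts are already split into lines and uses a Needleman-Wunsch style global alignment, with `difflib.SequenceMatcher` as the line similarity. The names `line_similarity`, `align_lines` and `gap_penalty` are hypothetical and only serve this illustration.

```python
from difflib import SequenceMatcher


def line_similarity(a, b):
    """Return a similarity in [0, 1]; 1.0 means the lines are identical."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def align_lines(gt_lines, ocr_lines, gap_penalty=0.3):
    """Globally align two lists of lines (hypothetical helper).

    Returns a list of (gt_line, ocr_line) pairs; None marks a gap,
    i.e. a line that has no counterpart on the other side.
    """
    n, m = len(gt_lines), len(ocr_lines)
    # score[i][j]: best score for aligning gt_lines[:i] with ocr_lines[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # back[i][j]: the move that produced score[i][j]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = -gap_penalty * i
        back[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = -gap_penalty * j
        back[0][j] = "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Map similarity from [0, 1] to [-1, 1] so that pairing two
            # dissimilar lines costs more than leaving a gap.
            match = 2 * line_similarity(gt_lines[i - 1], ocr_lines[j - 1]) - 1
            candidates = [
                (score[i - 1][j - 1] + match, "diag"),
                (score[i - 1][j] - gap_penalty, "up"),
                (score[i][j - 1] - gap_penalty, "left"),
            ]
            # On ties, the first maximal candidate wins (diag before gaps).
            score[i][j], back[i][j] = max(candidates, key=lambda c: c[0])

    # Walk the back pointers from the bottom-right corner to the origin.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((gt_lines[i - 1], ocr_lines[j - 1]))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((gt_lines[i - 1], None))
            i -= 1
        else:
            pairs.append((None, ocr_lines[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


gt = ["Hello world", "This line is missing in OCR", "Goodbye"]
ocr = ["Hello wor1d", "Goodbye"]
for gt_line, ocr_line in align_lines(gt, ocr):
    print(repr(gt_line), "|", repr(ocr_line))
```

Each resulting pair could then be rendered as one row of the side-by-side view, with an empty cell where one side is `None`. Note that this sketch compares every GT line with every OCR line, so a real implementation for long documents would probably need to restrict or cache those comparisons.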