dhlab-epfl/dhSegment

Baselines to Textlines

@solivr @SeguinBe @raphaelBarman

Once I have detected the baseline masks, how can I convert them into text line boxes or polygons?

dhSegment does not return the bounding boxes of text lines, so some additional computation is needed. I tried a very basic approach: using the horizontal projection to estimate the height of each text line and constructing a text box from it (see here).
A very interesting read is the paper "Influence of Text Line Segmentation", which goes more in depth on this topic; maybe you can get some ideas from it.
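
For illustration, here is a minimal sketch of the horizontal projection idea mentioned above. It is not the code linked in this thread and is not part of dhSegment. The function names `row_bands` and `baselines_to_boxes` are hypothetical. It assumes roughly horizontal, single-column text, a binary baseline mask, and a separately binarized page image (`ink`, 1 where there is ink) of the same size. Both are passed as nested lists of 0/1 values (for a NumPy array, `.tolist()` gives this form).

```python
from typing import List, Sequence, Tuple

Grid = Sequence[Sequence[int]]     # rows of 0/1 values
Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max), inclusive


def row_bands(grid: Grid) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each run of consecutive non-empty rows."""
    bands: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(grid):
        if any(row):
            if start is None:
                start = y
        elif start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(grid) - 1))
    return bands


def baselines_to_boxes(baseline_mask: Grid, ink: Grid, min_ink: int = 1) -> List[Box]:
    """Build one box per baseline from the ink's horizontal projection.

    Assumption: each run of non-empty rows in baseline_mask is one baseline,
    which only holds for roughly horizontal, single-column text.
    """
    boxes: List[Box] = []
    for y0, y1 in row_bands(baseline_mask):
        # Horizontal extent of this baseline.
        cols = [x for row in baseline_mask[y0:y1 + 1]
                for x, v in enumerate(row) if v]
        x0, x1 = min(cols), max(cols)

        # Horizontal projection: ink pixels per row, within the baseline's columns.
        profile = [sum(row[x0:x1 + 1]) for row in ink]

        # Grow upward (ascenders) while the rows above still contain ink.
        top = y0
        while top > 0 and profile[top - 1] >= min_ink:
            top -= 1
        # Grow downward (descenders) while the rows below still contain ink.
        bottom = y1
        while bottom < len(ink) - 1 and profile[bottom + 1] >= min_ink:
            bottom += 1

        boxes.append((x0, top, x1, bottom))
    return boxes
```

Calling `baselines_to_boxes(mask, ink)` returns one `(x_min, y_min, x_max, y_max)` box per detected baseline. This sketch merges lines whose ascenders and descenders touch, and it breaks down on skewed or multi-column pages. In those cases, labelling each baseline as its own connected component and capping the line height would probably be needed.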