Some questions about the textlines annotation process

Question

Opened this issue 2 years ago · 5 comments

Hello hao,
Thanks for your awesome work on document image dewarping.
Could you provide more details about the textline annotation process, e.g., the kernel sizes used for binarization and dilation, and the filtering rule?

Answer 1 · 2023-01-05T08:23:44.000Z

Hi, sorry for the late reply; I have had some health issues.
I use cv2.adaptiveThreshold for binarization, as follows:

cv2.adaptiveThreshold(xxx, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, ADAPTIVE_WINSZ, 25)

For dilation, the kernel size is 1 * 10 (h * w).

Answer 2 · 2023-01-06T05:51:25.000Z

Thanks for your reply.
Hope you will get well soon :)
I still have some questions: how did you choose ADAPTIVE_WINSZ in cv2.adaptiveThreshold, and how do you filter out connected regions that are not textlines?

Answer 3 · 2023-01-08T07:05:46.000Z

  ADAPTIVE_WINSZ = 35

  # width and height are the dimensions of a textline candidate
  if (width < 30) or (height < 2) or (width < 1.5 * height):
      # this candidate is not a textline

Hope this helps.

Answer 4 · 2023-01-09T09:18:59.000Z

Thank you for sharing the experimental details!

Answer 5 · 2023-01-10T09:23:30.000Z

@fh2019ustc
I have a question about the localization step of the textline annotation process.
When creating the textline masks, did you fill in all the pixels inside the bounding boxes, or did you shrink the heights of the bounding boxes so that the masks only pass through the middle of each box?