Can you share the word crop code

Question

Can you share the word crop code?

Closed this issue 2 years ago · 5 comments

The paper says: "We crop from the proposed multilingual dataset. We discard images with widths shorter than 32 pixels as they are too blurry, and obtain 4.1M word images in total."
But when I cropped the dataset myself, I ended up with more than 7 million text line images.

Answer 1 · 2021-03-28T03:59:51.000Z

How did you crop the text regions? Did you use axis-aligned boxes or quadrilaterals?

Answer 2 · 2021-03-28T05:53:53.000Z

@Jyouhou I use axis-aligned boxes, and I keep only the rectangles whose width and height are both greater than 32 pixels.

Answer 3 · 2021-03-28T14:14:30.000Z

Thanks for the reply.

Most of the text in the dataset is heavily rotated, so an axis-aligned box can be much larger than the word itself. I filtered by the shortest edge of the quadrilaterals (not the axis-aligned boxes).
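
For readers comparing the two filters, here is a minimal sketch of the difference. It is not the original cropping code. The function names, the corner layout (four (x, y) points in order around the box), and the use of `>=` with the 32-pixel threshold from the paper quote are all assumptions for illustration.

```python
import numpy as np

MIN_LEN = 32  # threshold taken from the paper quote; exact comparison is an assumption


def shortest_edge(quad):
    """Length of the shortest side of a quadrilateral given as 4 (x, y) corners in order."""
    pts = np.asarray(quad, dtype=np.float64).reshape(4, 2)
    # Vector from each corner to the next one, wrapping around to the first.
    edges = np.roll(pts, -1, axis=0) - pts
    return float(np.linalg.norm(edges, axis=1).min())


def axis_aligned_size(quad):
    """Width and height of the axis-aligned bounding box of the quadrilateral."""
    pts = np.asarray(quad, dtype=np.float64).reshape(4, 2)
    w, h = pts.max(axis=0) - pts.min(axis=0)
    return float(w), float(h)


def keep_by_quad(quad, min_len=MIN_LEN):
    """Filter on the quadrilateral's shortest edge."""
    return shortest_edge(quad) >= min_len


def keep_by_bbox(quad, min_len=MIN_LEN):
    """Filter on the axis-aligned box's width and height."""
    return min(axis_aligned_size(quad)) >= min_len


# Hypothetical thin word rotated by 45 degrees: about 141 px long, about 14 px tall.
rotated = [(0, 0), (100, 100), (90, 110), (-10, 10)]
print(axis_aligned_size(rotated), keep_by_bbox(rotated))
print(shortest_edge(rotated), keep_by_quad(rotated))
```

In this example the axis-aligned box is 110 by 110 pixels and passes the box filter, while the shortest edge of the quadrilateral is about 14 pixels and fails the quadrilateral filter. On a dataset with many rotated words, this kind of case would make the box filter keep noticeably more crops.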

Answer 4 · 2021-03-29T00:59:34.000Z

@Jyouhou Can you share your WeChat? It would be more convenient to communicate there.

Answer 5 · 2021-03-29T04:24:57.000Z

Sure. You can send your WeChat account to my CMU email: shangbal@cs.cmu.edu