Jyouhou/UnrealText

Can you share the word crop code

Closed this issue · 5 comments

The paper says: "We crop from the proposed multilingual dataset. We discard images with widths shorter than 32 pixels as they are too blurry, and obtain 4.1M word images in total."
But when I cropped the dataset myself, I ended up with more than 7 million text line images.

How did you crop the text regions? Did you use axis-aligned boxes or quadrilaterals?

@Jyouhou I use axis-aligned boxes, and keep only the rectangles whose width and height are both greater than 32 pixels.
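
For readers following along, here is a minimal sketch of the axis-aligned filter described in this comment. It is not code from the repository: the function names, the `quad` format (four `(x, y)` corner points) and the `min_size` parameter are assumptions made for illustration.

```python
def axis_aligned_box(quad):
    """Return (x_min, y_min, x_max, y_max) of the box enclosing a quadrilateral.

    `quad` is assumed to be a sequence of four (x, y) corner points.
    """
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


def keep_axis_aligned(quad, min_size=32):
    """Keep a region only if its enclosing box is wider AND taller than min_size."""
    x_min, y_min, x_max, y_max = axis_aligned_box(quad)
    return (x_max - x_min) > min_size and (y_max - y_min) > min_size
```

Note that this test looks only at the enclosing box, so the result depends on how the text is rotated as well as on its actual size.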

Thanks for the reply.

Most of the text in the dataset is heavily rotated. I filtered by the shortest edge of the quadrilaterals (not of the axis-aligned boxes).
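
A rough sketch of what filtering by the shortest quadrilateral edge could look like is given here. This is an illustration under assumptions, not the author's actual script: the function names, the four-point `quad` format and the exact threshold comparison (keep when the shortest edge is at least 32 pixels) are guesses based on the paper's wording.

```python
import math


def shortest_edge(quad):
    """Length of the shortest side of a quadrilateral.

    `quad` is assumed to hold four (x, y) corners in order around the shape.
    """
    return min(math.dist(quad[i], quad[(i + 1) % 4]) for i in range(4))


def keep_by_shortest_edge(quad, min_edge=32):
    """Keep a region only if its shortest side is at least min_edge pixels."""
    return shortest_edge(quad) >= min_edge
```

For a word rotated by 45 degrees that is 200 pixels long but only 20 pixels tall, the enclosing axis-aligned box is well over 32 pixels in both directions, while the shortest edge is 20 pixels, so this filter rejects it. That kind of case may account for part of the gap between the two counts.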

@Jyouhou Can you share your WeChat? It would be more convenient to communicate there.

Sure. You can send your WeChat account to my CMU email: shangbal@cs.cmu.edu