Can you share the word crop code?
Closed this issue · 5 comments
wushilian commented
In the paper: "We crop from the proposed multilingual dataset. We discard images with widths shorter than 32 pixels as they are too blurry, and obtain 4.1M word images in total."
But I ended up with more than 7 million text line images.
Jyouhou commented
How did you crop the text regions? Did you use axis-aligned boxes or quadrilaterals?
wushilian commented
@Jyouhou I use axis-aligned boxes, and I keep only the rectangles whose width and height are both greater than 32 pixels.
Jyouhou commented
Thanks for the reply.
Most of the text in the dataset is highly oriented. I filtered by the shortest edge of the quadrilaterals (not the axis-aligned boxes).
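To illustrate the difference between the two filters, here is a minimal sketch, not the actual cropping code. It assumes each annotation is four (x, y) corner points listed in order around the quadrilateral, and it reuses the 32-pixel threshold from the paper quote. The names `shortest_edge`, `keep_by_quad` and `keep_by_box` are hypothetical.

```python
import math

MIN_EDGE = 32  # pixels; threshold taken from the paper quote (assumed to apply to the shortest edge)

def shortest_edge(quad):
    """Length of the shortest side of a quadrilateral given as four (x, y) corners in order."""
    return min(
        math.hypot(quad[(i + 1) % 4][0] - quad[i][0],
                   quad[(i + 1) % 4][1] - quad[i][1])
        for i in range(4)
    )

def keep_by_quad(quad, min_edge=MIN_EDGE):
    """Filter on the quadrilateral itself: discard if its shortest side is under min_edge."""
    return shortest_edge(quad) >= min_edge

def keep_by_box(quad, min_edge=MIN_EDGE):
    """Filter on the axis-aligned bounding box: keep if width and height both exceed min_edge."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return (max(xs) - min(xs)) > min_edge and (max(ys) - min(ys)) > min_edge

# A thin text line rotated by 45 degrees (made-up coordinates for illustration).
rotated = [(0, 0), (100, 100), (90, 110), (-10, 10)]
print(keep_by_box(rotated))   # the bounding box is 110 x 110, so the box filter keeps it
print(keep_by_quad(rotated))  # the short side is about 14 px, so the quad filter drops it
```

A rotated, thin region like this one passes the axis-aligned check but fails the quadrilateral check, which may account for part of the gap between the two counts.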
Jyouhou commented
Sure. You can send your WeChat account to my CMU email: shangbal@cs.cmu.edu