Question about your dataset number?
Johnson-yue opened this issue · 2 comments
@Johnson-yue
Hi, thanks for your interest and for the very first question about our paper.
We did not mean "19,514 characters" as the number of images (or glyphs),
but the number of real characters, i.e., Unicode.
Thus, each font has 6,654 images on average, and the union of the Unicode characters covered by all fonts in the train set contains 19,514 characters.
The total number of "images" in the train set, by contrast, is 6,654 * 482 = 3,207,228.
Please don't hesitate to ask if you have any other questions.
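To make the distinction concrete, here is a minimal sketch with a toy dataset. The font names and character sets are made up for illustration and are not taken from the real train set; `dataset_stats` is a hypothetical helper, not part of our code.

```python
# Toy illustration: made-up fonts and characters, NOT the real dataset.
font_charsets = {
    "font_a": {"A", "B", "C"},
    "font_b": {"B", "C", "D"},
    "font_c": {"C", "E"},
}


def dataset_stats(charsets):
    """Return (distinct characters, total glyph images, mean images per font)."""
    # Distinct characters: the union of every font's character set.
    distinct = set().union(*charsets.values())
    # Glyph images: one image per (font, character) pair.
    total_images = sum(len(chars) for chars in charsets.values())
    mean_per_font = total_images / len(charsets)
    return len(distinct), total_images, mean_per_font


n_chars, n_images, mean_per_font = dataset_stats(font_charsets)
print(n_chars, n_images, round(mean_per_font, 2))

# The same relation with the numbers quoted in this thread:
# mean images per font * number of fonts = total images.
assert 6654 * 482 == 3_207_228
```

In the toy case the union holds 5 characters while the fonts together hold 8 images, which mirrors how 19,514 characters and about 3.2 million images can describe the same train set.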
Hi, I am building the LMDB file now; sorry for the late reply.
Building the LMDB is very slow: every 6,734 "images" take 544 s.
For your LMDB, with 482 fonts * 6,654 Unicode characters, how large is the file?
You mean that the set of Unicode characters available differs from font to font, and that the 482 fonts together contain 19,514 characters. Yes, I understand it now. Thank you.
By the way, I tested AGIS-net, and it is impossible to reproduce the performance reported in their paper using their GitHub repo. After I asked two questions, they closed the issue... And thanks for your reply.