clovaai/lffont

Training interrupted by validation

zha-hengfeng opened this issue · 10 comments

During phase 1 training, the program was interrupted by an error during validation.


8uos commented

Hi, I checked the error and updated the code.
You can check the difference in the latest commit.
Sorry for the inconvenience.

Thank you for your help, it's great work. But I have a few other questions.

(1) In issue #1 you mentioned that the dataset contains 6,654 * 482 = 3,207,228 images. You used 6,654 * 371 for training and 15 * (2,615 + 280) for validation/evaluation. Is my understanding correct?

(2) The content font is not included in the training and test sets (train.json and test.json); you only configured it in cfgs (0250_simple). Does the content font only need to be written to the lmdb via meta_file, without appearing in train.json?

8uos commented

(1) It is almost right, but the training dataset does not contain exactly that number of images.
In the training dataset, each font covers a varying number of characters because of rendering issues, and 6,654 is the average. So the total number of images can differ slightly from that value (6,654 * 371); see the quick check below.
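
To make the bookkeeping concrete, here is a quick arithmetic check using only the numbers quoted above (the training count is approximate because of the per-font variation; variable names are just for illustration):

avg_chars_per_font = 6654                                        # average, as explained above
train_fonts, val_fonts = 371, 15
val_seen_chars, val_unseen_chars = 2615, 280

approx_train_images = avg_chars_per_font * train_fonts           # 2,468,634 (approximate)
val_images = val_fonts * (val_seen_chars + val_unseen_chars)     # 43,425
total_estimate = avg_chars_per_font * 482                        # 3,207,228, the figure from issue #1
print(approx_train_images, val_images, total_estimate)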

(2) It is right. The content font does not appear in train.json because it must be excluded from the reference styles of the training and test sets. However, it still needs to be written to the lmdb so that source images can be rendered.
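
Conceptually, the split looks like this (hypothetical font names; the real meta_file / train.json formats in this repo differ):

fonts_in_lmdb = ["style_font_01", "style_font_02", "source_font"]   # everything rendered into the lmdb
content_font = "source_font"                                        # hypothetical content-font name

# reference styles used in train.json / test.json exclude the content font,
# but it stays in the lmdb so that source (content) images can be rendered
reference_fonts = [f for f in fonts_in_lmdb if f != content_font]
assert content_font not in reference_fonts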


Sorry to bother you again, but there seems to be a problem with the code in evaluator.py: during validation, all fonts in the set generate the same image. Lines 71-72 of the code should probably be changed to the following:
if trg_imgs:
  for trg_img in trg_imgs:
    trgs.append(trg_img.detach().cpu())

Maybe there's something wrong with my modification too, but that's where the problem seems to be.

8uos commented

Lines 71-72 of evaluator.py are not wrong (the [0] is added in line 72 because of *trg_imgs in line 61).
Also, they cannot cause the issue you described, because trg_imgs holds the ground-truth images (the upper images), not the generated images.
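
For clarity, here is a minimal standalone sketch (illustrative names only, not the actual evaluator.py code; PyTorch assumed) of why indexing with [0] is correct when the target batch arrives through *trg_imgs:

import torch

def save_with_targets(gen_imgs, *trg_imgs):
    # *trg_imgs packs the single ground-truth batch into a one-element tuple,
    # so trg_imgs[0] is the whole ground-truth batch, not a single glyph
    trgs = []
    if trg_imgs:
        trgs.append(trg_imgs[0].detach().cpu())
    return trgs

gen = torch.randn(4, 1, 128, 128)    # fake generated glyph batch
gt = torch.randn(4, 1, 128, 128)     # fake ground-truth glyph batch
out = save_with_targets(gen, gt)
assert torch.equal(out[0], gt)       # the ground truth is preserved as-is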

What do your generated images look like?
I have checked and run the code, and my model seems to be working well.

Before the modification it looked like this: all results seem to be generated for the first line of the font, so I think it's because the code only saves the first image of the font from trg_imgs.

I modified evaluator.py and it works fine.

@zha-hengfeng
Hi, your modification is fundamentally identical to the original implementation:

if trg_imgs:
  for trg_img in trg_imgs:
    trgs.append(trg_img.detach().cpu())

Because trg_imgs is a list with only one element.
We will not revise the code because we don't think there is a bug in it.
Thanks for your comments, and we hope our implementation helps your own applications.

Thank you again. Your work is great and has helped me a lot.

Now I find that the phase 1 model fails in testing, and the code doesn't generate a new glyph for the input reference glyphs at test time.

I would like to know what kind of generated glyph the model will produce for a target character, given the input reference font.

@zha-hengfeng
Hi,
(1) We use the two-phase training procedure so that the model can generate glyphs containing components that are unseen in your reference set.
(2) If you only have the phase 1 model, you cannot generate a glyph whose components are unseen in the reference. For example (a small code sketch of this limitation follows below):

  • Reference characters: ["丠", "垙"] => can be decomposed into ["爿", "匕", "一", "土", "⺌", "一", "儿"]
  • You can generate characters such as "壯" => decomposed into ["爿", "土"]
  • You cannot generate characters such as "暛" => decomposed into ["日", "羊", "工"]

(3) Phase-2 training solves the problem in (2). Hence, you have to train your model with phase 2 if you want to generate such unseen glyphs.
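
Here is a minimal sketch of the phase-1 limitation in (2), using only the example decompositions above (the actual decomposition table shipped with the repo is much larger):

# components covered by the reference characters ["丠", "垙"] in the example above
reference_components = {"爿", "匕", "一", "土", "⺌", "儿"}

def phase1_can_generate(target_components):
    # a phase-1 model can only compose components already present in the reference set
    return set(target_components) <= reference_components

print(phase1_can_generate(["爿", "土"]))         # "壯" -> True
print(phase1_can_generate(["日", "羊", "工"]))    # "暛" -> False: 日, 羊, 工 are unseen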

We have already published the technical details in our paper (https://arxiv.org/abs/2009.11042).
You can check the details in the paper as well.

Closing the issue, assuming the answer resolves the problem.
Please re-open the issue as necessary.