msight-tech/research-charnet

some questions


Hello author, after reading the charnet paper and code, I have some questions:

1. Character Branch

Section 3.2 (Character Branch) of the paper says:

This branch contains three sub-branches, for text instance segmentation, character detection and character recognition, respectively.

But in model.py, I couldn't find the text instance segmentation sub-branch depicted in Figure 2. Is it replaced in your code by a shrunk character region score prediction branch, as in the EAST model?

Below are some visualization samples produced with your pretrained model:
Screenshot from 2019-11-12 17-05-20
Screenshot from 2019-11-12 17-08-50
(I used cv2.applyColorMap(), cv2.addWeighted() and cv2.polylines() for better visualization.)
(Note that the angle output is None.)

So, is charnet's Character Branch in fact an EAST-like head (shrunk character score map and geometry map) plus a character recognition head?

2. ic15 testset performance

Using the pretrained model and the default config file, I get the following result on the ic15 test set:

precision:0.966   recall:0.744   hmean:0.841

This is far below the result reported in the paper. I also noticed that pred_char_orient in the CharDetector class is None. Is the open-sourced code incomplete?
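
As a sanity check on the numbers above, hmean is the harmonic mean of precision and recall. A short sketch (the function name `hmean` is my own, not taken from the evaluation script):

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported above for the pretrained model on ic15.
print(round(hmean(0.966, 0.744), 3))
```

The three reported values are consistent with each other, so the gap to the paper comes from recall rather than from a mistake in computing the metric.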

3. Iterative Character Detection

The Iterative Character Detection method is key to training charnet on real-world datasets. During each step (2nd to 4th), is the model that generates the pseudo ground-truth character bounding boxes (Model A) kept fixed and separate from the model being trained (Model B)? Or is there only one model throughout the whole training schedule?
Looking forward to your reply, thanks!