intel/handwritten-chinese-ocr-samples

CER always bigger than 1

IAASSIBLCU opened this issue · 4 comments

help!!!
I am trying to reproduce this paper, but I ran into some trouble in the training stage and hope to get some suggestions.
Background:
Training dataset: CASIA-HWDB2.x train set (images are JPG; label txt files are UTF-8 encoded, prepared on Windows)
Modification to the code: replaced warp-ctc with the native PyTorch CTC loss
Problem:
From the paper, the CER should end up below 0.5. However, when I train this model with a batch size of 8,
the CER stays around 1.7, and with a batch size of 4 the CER stays at 1 even after more than 10 epochs.

I have no idea how to deal with this problem. Has anyone reproduced this paper successfully? Could you give some suggestions, for example on data preprocessing?
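For reference, this is roughly how I compute CER (a simplified sketch, not my exact evaluation script): the edit distance between the prediction and the label, divided by the label length, so values above 1 simply mean the model emits far more wrong characters than the label contains.

```python
# Sketch of the CER metric: Levenshtein distance / reference length.
def edit_distance(pred, ref):
    # Standard dynamic-programming Levenshtein distance.
    dp = list(range(len(ref) + 1))
    for i, p in enumerate(pred, 1):
        prev, dp[0] = dp[0], i
        for j, r in enumerate(ref, 1):
            cur = min(dp[j] + 1,        # drop a predicted character
                      dp[j - 1] + 1,    # insert a missing character
                      prev + (p != r))  # substitute
            prev, dp[j] = dp[j], cur
    return dp[-1]

def cer(pred, ref):
    return edit_distance(pred, ref) / max(len(ref), 1)

# A long wrong prediction against a short label easily gives CER > 1:
# cer("ABCDEFGHIJ", "ABCD") == 6 / 4 == 1.5
```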

Thanks for your interest. :)

First of all, I should mention that this work was developed during my previous job, so I can no longer share the environment information. But in my current spare time, I am trying to improve the reproducibility on my own development machine.

For your experiment, I'm glad to see you've got it training. From my experience, I would suggest the following:

  • You'd better start from warp-ctc (a long time ago we tried the native PyTorch CTC, but the results were not as good as with warp-ctc, especially for long sentences), though of course this may not be the root cause of your current result. See the sketch after this list for how the native loss expects its inputs.
  • Please check the training stage first. If training looks normal, the test should behave the same. As we can see, the loss after 5 epochs has already decreased to 20+, which should be fine, but the prediction results printed during training still look random. FYI, in our previous experiments the CER could get close to 0.15 after 5 epochs.
  • Double-check the test/val dataset, especially the dictionary list and the mapping from images to labels.
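To make the first point concrete, here is a minimal sketch of how the native torch.nn.CTCLoss expects its inputs; when swapping losses, the usual pitfalls are a missing log_softmax, a blank index that doesn't match the dictionary, and length tensors of the wrong shape or dtype. Names such as `model` and `images` are placeholders, not code from this repo.

```python
import torch
import torch.nn.functional as F

# The blank index must match the one reserved in your dictionary.
ctc_loss = torch.nn.CTCLoss(blank=0, zero_infinity=True)

def training_step(model, images, targets, target_lengths):
    logits = model(images)                    # (T, N, num_classes)
    log_probs = F.log_softmax(logits, dim=2)  # nn.CTCLoss expects log-probabilities
    T, N, _ = log_probs.shape
    input_lengths = torch.full((N,), T, dtype=torch.long)
    # targets: (N, S) padded tensor of dictionary indices, blank excluded
    return ctc_loss(log_probs, targets, input_lengths, target_lengths)
```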

In short, the CER issue in your post is probably caused by the decoding process, which is closely tied to how the dataset (in particular the dictionary) is organized; a rough decoding sketch follows. I hope this information helps. Thanks.
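For reference, here is a minimal greedy-decoding sketch (an illustration of the generic CTC decoding step, not this repo's decoder): take the argmax per frame, collapse repeats, drop blanks, and map indices back to characters. `idx_to_char` stands for whatever dictionary you build from the label files, and the blank is assumed to be index 0; if that mapping is off by one or ordered differently from training, the decoded text looks random and the CER blows up.

```python
import torch

def greedy_decode(log_probs, idx_to_char, blank=0):
    # log_probs: (T, num_classes) network output for one text-line image
    best = log_probs.argmax(dim=1).tolist()
    chars, prev = [], blank
    for idx in best:
        if idx != blank and idx != prev:  # skip blanks and collapse repeats
            chars.append(idx_to_char[idx])
        prev = idx
    return "".join(chars)
```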

Thanks for your help; your suggestions were really helpful. I have trained it up and solved the problem. The root cause was the native PyTorch CTC, which I replaced with warp-ctc. In addition, I moved the predictions and the CTC loss from GPU to CPU, because the CTC loss was always 0 when the predictions stayed on the GPU. It seems to be a known bug of warp-ctc, which many people have reported here:
SeanNaren/warp-ctc#102
SeanNaren/warp-ctc#59

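For anyone else who hits the zero-loss issue, this is roughly the workaround I used (a sketch only, assuming the SeanNaren warpctc_pytorch binding and its CTCLoss()(acts, labels, act_lens, label_lens) call; the variable names are placeholders from my script):

```python
import torch
from warpctc_pytorch import CTCLoss  # SeanNaren binding

criterion = CTCLoss()

def ctc_loss_on_cpu(logits, flat_labels, input_lengths, label_lengths):
    # logits: (T, N, num_classes) raw network outputs; warp-ctc applies softmax internally
    acts = logits.cpu()                   # keep activations on CPU to avoid the zero-loss bug
    labels = flat_labels.int().cpu()      # 1-D IntTensor of concatenated label indices
    act_lens = input_lengths.int().cpu()
    label_lens = label_lengths.int().cpu()
    return criterion(acts, labels, act_lens, label_lens)
```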
Thanks again!!! Have a good day.

Actually, warp-ctc can run on the GPU, but it needs to be recompiled whenever the CUDA version is upgraded. Users who need CUDA 11+ can refer to the link below to recompile warp-ctc. Thanks.

SeanNaren/warp-ctc#182

This reply is quite useful to me.