Could you give me some suggestions for the training dataset?
yangxiuwu opened this issue · 1 comment
yangxiuwu commented
Hi, I have trained a Chinese OCR model with CRNN, using 3 million (300W) synthetic text images as the training dataset, but the model performs poorly on real-scene images. Could you give me some suggestions for the training dataset?
Does the dataset require a fixed aspect ratio?
Does the dataset need data augmentation, e.g. transforms, blur, different font colors, and diverse backgrounds?
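To make the two questions concrete, here is a minimal sketch of the kind of preprocessing and augmentation meant, written in plain Python with no image library. Everything in it is an illustrative assumption rather than part of an existing pipeline: the function names (`box_blur`, `recolor`, `resize_to_height`, `augment`), the target height of 32, and the use of a single random gray level as a stand-in for "diverse backgrounds". Images are plain lists of rows with values 0-255.

```python
import random

def box_blur(img):
    """3x3 mean blur on a grayscale image given as a list of rows (0-255)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Average over the neighbours that fall inside the image.
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out

def recolor(mask, rng):
    """Map a binary text mask (1 = ink) to random ink and background grays."""
    ink = rng.randint(0, 100)        # dark "font color" (assumed range)
    paper = rng.randint(155, 255)    # light background (assumed range)
    return [[ink if px else paper for px in row] for row in mask]

def resize_to_height(img, target_h):
    """Nearest-neighbour resize to a fixed height, keeping the aspect ratio."""
    h, w = len(img), len(img[0])
    target_w = max(1, round(w * target_h / h))
    return [[img[y * h // target_h][x * w // target_w]
             for x in range(target_w)]
            for y in range(target_h)]

def augment(mask, rng, target_h=32, blur_prob=0.5):
    """Recolor, optionally blur, then resize one synthetic text mask."""
    img = recolor(mask, rng)
    if rng.random() < blur_prob:
        img = box_blur(img)
    return resize_to_height(img, target_h)

# Hypothetical 4x12 text mask standing in for one rendered text line.
mask = [[1 if (x + y) % 3 == 0 else 0 for x in range(12)] for y in range(4)]
sample = augment(mask, random.Random(0))
print(len(sample), len(sample[0]))
```

Here the height is fixed while the width follows from the original aspect ratio, which is one possible reading of the first question; whether that or a fully fixed size works better for this model is exactly what I am asking.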
Cocoalate commented
Hi, is your dataset of 3 million (300W) synthetic text images public? I'm working on receipt OCR now, but I don't have enough data yet.