HusseinYoussef/Arabic-OCR

Train the model

mrunal2401 opened this issue · 4 comments

Hey Hussein, you have made a good project, but I want to ask one thing: how do I train the model on Arabic images (printed or handwritten), or how did you train the model?
Answers are appreciated :)

Thank you.

Thank you for the information. Could you please share how you trained the model and the steps involved? Thank you again.

We are using a simple model (a shallow neural network, as far as I recall) that needs images of single characters for training.

We had a dataset of pages with Arabic text (images and text), and had to break those Arabic pages into single-character images to train the model.
So we first split each page into images of text lines, then split each line into images of Arabic words, and finally split each word into images of single characters.
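
The first two steps can be sketched roughly as follows. This is only an illustration, not the repository's code: it assumes a binarized page (a 2-D NumPy array with 1 for ink and 0 for background) and uses simple projection profiles, which is one common technique and not necessarily the one from the papers. The names `find_runs`, `segment_lines`, `segment_words`, and the `min_gap` parameter are made up for this example.

```python
import numpy as np

def find_runs(profile, min_gap=1):
    """Return (start, end) pairs of consecutive non-zero entries in a 1-D
    profile, merging runs separated by fewer than min_gap zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ended just before i
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile)))

    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: same segment
        else:
            merged.append((s, e))
    return merged

def segment_lines(page):
    """Split a binary page into line images using the row-wise ink count."""
    row_profile = page.sum(axis=1)
    return [page[s:e, :] for s, e in find_runs(row_profile)]

def segment_words(line, min_gap=3):
    """Split a line image into word images using the column-wise ink count.
    min_gap (assumed value) is the smallest blank width treated as a space.
    Words come back in left-to-right column order; reverse the list to get
    Arabic reading order."""
    col_profile = line.sum(axis=0)
    return [line[:, s:e] for s, e in find_runs(col_profile, min_gap)]
```

Splitting words into characters is harder for Arabic because the letters inside a word are connected, so a plain projection profile like this is not enough for that last step.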

Those single-character images are fed into the model for training. We used scikit-learn to build and train the model, for simplicity.
We also referred to several papers that describe the approaches we used for line, word, and character segmentation.
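
For the training step, a minimal sketch with scikit-learn could look like the following. It assumes the character images have already been resized to a common shape and labeled. The function name `train_char_classifier`, the hidden layer size, the 80/20 split, and `max_iter` are assumptions for illustration, not the settings used in this project.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def train_char_classifier(images, labels, hidden_units=100, seed=0):
    """Train a one-hidden-layer network on single-character images.

    images: array of shape (n_samples, height, width), all the same size
    labels: array of shape (n_samples,) with one class label per image
    Returns the fitted model and its accuracy on a held-out 20% split.
    """
    # Flatten each image into one feature vector of pixel values.
    X = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
    y = np.asarray(labels)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )

    # A single hidden layer keeps the network shallow (sizes are assumed).
    model = MLPClassifier(
        hidden_layer_sizes=(hidden_units,), max_iter=500, random_state=seed
    )
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)
```

Calling `model.predict` on new flattened character images of the same size then returns the predicted labels.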

Thank you so much @HusseinYoussef