some fun with OCR stuff

OCR


Data Generation

Data is generated by the generate.py script, which uses about 25 fonts and a dictionary from the ./misc/fonts and ./misc/dict paths respectively. With a probability of 23 percent it adds a random background from ./misc/pictures.
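The sampling step can be sketched roughly like this. It is a minimal illustration using only the standard library, not the actual generate.py code: the helper name `sample_spec`, its arguments, and the placeholder file names are all hypothetical, and the real script also has to render the image.

```python
import random

P_BACKGROUND = 0.23  # probability of adding a random background (from the text)


def sample_spec(words, fonts, backgrounds, rng):
    """Pick what one synthetic picture would contain (hypothetical helper)."""
    return {
        "text": rng.choice(words),
        "font": rng.choice(fonts),
        # Add a background in roughly 23 percent of cases, otherwise none.
        "background": rng.choice(backgrounds) if rng.random() < P_BACKGROUND else None,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder names; the real files live in ./misc/dict, ./misc/fonts
    # and ./misc/pictures.
    words = ["volga-moscow", "Gentoo", "woollike"]
    fonts = ["font_a.ttf", "font_b.ttf"]
    backgrounds = ["bg_1.jpg", "bg_2.jpg"]
    specs = [sample_spec(words, fonts, backgrounds, rng) for _ in range(10000)]
    share = sum(s["background"] is not None for s in specs) / len(specs)
    print(f"share with background: {share:.3f}")
```

Passing the random generator in explicitly keeps the sketch reproducible, which is handy when comparing datasets generated with different settings.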

Usage:

python generate.py

will generate around 450K pictures in the ./data dir, using settings from configs.py. Roughly normalized character frequencies:

&     +
'     +++++++++++++++++++++++++
-     +++++++++++++
.     ++++++++++++++++++++++++++++++++++++++++++++++++
0     +
1     ++++++
2     ++++
3     ++
4     ++
5     +
6     +
8     +
9     +
B     +++++++++++++++++++++++++++++++++++++++++
C     +++++++++++++++++++++++++++++++++++++++++
D     ++++++++++++++++++
E     ++++++++++++++
F     ++++++++++++++++++++++++++++++++++++++++++++
G     +++++
H     ++++++++++++++++++++++++++++++++++++++++++++++++
I     ++++++++++++++++++++++++
J     ++++++++++++++++++++++++++++++++++
K     +++++++++++++++++++++++++++++++++++
L     +++++++++++++++++++++++++++++++++++++++++++
M     ++++++++++++++++++++++++++++++++++++++++
N     ++++++++++++++++++++++++
O     +++++++++++++++++++++++++++++++++++++++++++++++++
P     ++++++++++++++
Q     ++++++++++++++++++++
R     ++++++
S     +++++++++++++++++++++++++++++++++++++++++++++++++
U     +++++++++++++++++++++++++++++++++++++++
V     ++++++++++++++++
W     ++++++++++++++++++++++++++++++++
X     ++++
Y     ++++++++++
Z     +++++++++++++++++++++++++++++++
a     +++++++++
b     ++++++++++++++++
c     +++++++++++++++++
d     +
e     ++++++++++++++++++++++++++++
f     ++++++++++++++++++++++++++++
g     +++++++++++++++++++++++++++
h     +
i     ++++++++++++++++++++++++++++++++++++++++++++++
j     +++++++++++++++++++++++++++++++++++++++++
k     ++++++
l     ++++++++++++++++++++++++++++++++++++++
m     +++++++++++++++++
n     ++++++++++++++++++++++++++++
o     ++++++++++++++++
p     ++++++++++++++++++++++++++++++++++++++++++++++++++
q     +++++++++++
r     +++++++++++++++++++++++++++++++++++++++++++++++++++
s     ++++++++++++++++++++++++++
t     +++++++++++++++++++++++++
u     ++++++++++++++++++
v     ++++++++++++
w     +++++++++++++++++++++++++++++++++++++++++++++++++
x     ++++++++++++
y     ++
z     ++++++++++++++++++++++
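A chart like the one above could be produced along these lines. This is only a sketch: how the original frequencies were normalized is not stated, so scaling every bar relative to the most common character is an assumption, and `frequency_chart` and its `width` parameter are hypothetical names.

```python
from collections import Counter


def frequency_chart(words, width=50):
    """Return lines like 'a     ++++', bars scaled to the most common character."""
    counts = Counter(ch for word in words for ch in word)
    if not counts:
        return []
    top = max(counts.values())
    lines = []
    for ch in sorted(counts):
        # Every character that occurs gets at least one '+'.
        bar = max(1, round(width * counts[ch] / top))
        lines.append(f"{ch}     {'+' * bar}")
    return lines


if __name__ == "__main__":
    # Tiny stand-in for the labels of the generated pictures.
    for line in frequency_chart(["volga-moscow", "VK.com", "Gentoo"]):
        print(line)
```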

Model

The baseline model follows Recurrent Convolutional Neural Networks for Text Classification, with Connectionist Temporal Classification (CTC) as the loss function. All source can be found in ./model/crnn.py. The model was trained using Adam with a learning rate of 1e-3 for the first seven epochs, then reduced to 1e-4 for the remaining 13 epochs. Training was stopped after 20 epochs, with a validation score of about 85 percent.

Usage:

python train.py
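The learning rate schedule from the Model section can be written down as a small function. This is a framework-agnostic sketch, not the code in train.py: the name `learning_rate` and the 0-based epoch indexing are assumptions.

```python
def learning_rate(epoch):
    """Learning rate for a 0-based epoch index, following the schedule in the text."""
    if not 0 <= epoch < 20:
        raise ValueError("training ran for 20 epochs (indices 0..19)")
    # 1e-3 for the first seven epochs, 1e-4 for the remaining 13.
    return 1e-3 if epoch < 7 else 1e-4


if __name__ == "__main__":
    for epoch in range(20):
        print(epoch, learning_rate(epoch))
```

In a real training loop the returned value would be written into the Adam optimizer at the start of each epoch; how that is done depends on the framework.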

Helpers

  • configs.py: model configs, generator configs, etc.
  • generator.py: creates the data
  • helpers.py: miscellaneous utilities plus the label encoder/decoder
  • demo.ipynb: demo notebook for playing with the data and model
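For readers unfamiliar with CTC, a label encoder/decoder of the kind mentioned for helpers.py might look roughly like this. It is a sketch of standard greedy (best-path) CTC decoding, not the code from the repository: the alphabet, the blank index 0, and the names `encode` and `ctc_greedy_decode` are assumptions.

```python
import string

BLANK = 0  # index reserved for the CTC blank symbol (assumption of this sketch)
ALPHABET = "&'-." + string.digits + string.ascii_letters  # hypothetical alphabet
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
ID_TO_CHAR = {i: ch for ch, i in CHAR_TO_ID.items()}


def encode(text):
    """Turn a string into a list of label ids (blank is never produced here)."""
    return [CHAR_TO_ID[ch] for ch in text]


def ctc_greedy_decode(frame_ids):
    """Collapse repeated ids, then drop blanks (CTC best-path decoding)."""
    chars = []
    previous = BLANK
    for idx in frame_ids:
        if idx != previous and idx != BLANK:
            chars.append(ID_TO_CHAR[idx])
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    g, e, n, t, o = encode("Gento")
    # Per-frame argmax ids as a model might emit them; the blank between the
    # two runs of 'o' is what lets the decoder keep the double letter.
    frames = [g, g, BLANK, e, n, n, t, o, BLANK, o, o]
    print(ctc_greedy_decode(frames))
```

The blank symbol is what makes doubled letters such as the "oo" in "Gentoo" recoverable: without a blank between them, repeated frames of the same character collapse into one.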

Some examples

Image      Predicted     GT
Example 1  volga-moscow  volga-moscow
Example 2  VK.com        VK.com
Example 6  Gentoo        Gentoo
Example 3  abdicant      felt-shod
Example 4  abdicant      idk lol
Example 5  abdicant      woollike