GitYCC/crnn-pytorch

Need help with input data preparation (training from custom images)

Closed this issue · 3 comments

I have an Excel file listing 10k crops, with two columns:

Column 1: Image_path (<path/abc.png>)

Column 2: Ground_Truth ()

data.csv looks like:

path | gt

C:/Users/1234/crop/ABC 07 07 2020_page1.png | 8 05 75 824.46Cr
C:/Users/1234/crop/PQW 07 10 2020_page1.png | Time 11 42 23
C:/Users/1234/crop/XRE 08 10 2020_page1.png | Account No. 200000592
C:/Users/1234/crop/JKL 07 10 2020_page1.png | 1 00 00 00 000.00

Now I need to use this as input for training.
Should I feed it into dataset.py first and then start training?

Or could you help me with the steps?

reference this

You need to prepare annotation_train.txt and annotation_val.txt, with each line in the format path, index, plus a lexicon.txt.
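
As a rough illustration of that conversion, here is a minimal sketch that turns a CSV with the `path` and `gt` columns from the question into the three files. It rests on several assumptions not confirmed in this thread: each annotation line is `path index` separated by a single space, `index` is the zero-based line number of the label in lexicon.txt, and the train/val split is a random 90/10. The function names `build_annotations` and `write_files` are made up for this example. Check dataset.py for the exact format the loader expects; if it splits each line on spaces, paths containing spaces (like the ones in the question) would need to be renamed first.

```python
import csv
import random
from pathlib import Path


def build_annotations(rows, val_fraction=0.1, seed=0):
    """Turn (path, text) pairs into a lexicon and train/val annotation lines.

    Assumed format (verify against dataset.py): "path index", where index
    is the zero-based line number of the text in lexicon.txt.
    """
    lexicon = []    # line i of lexicon.txt holds the text for index i
    index_of = {}   # text -> index, so repeated labels share one entry
    lines = []
    for path, text in rows:
        if text not in index_of:
            index_of[text] = len(lexicon)
            lexicon.append(text)
        lines.append(f"{path} {index_of[text]}")

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(lines)
    n_val = max(1, int(len(lines) * val_fraction)) if lines else 0
    return lexicon, lines[n_val:], lines[:n_val]


def write_files(csv_path, out_dir):
    """Read a CSV with 'path' and 'gt' columns and write the three files."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [(r["path"].strip(), r["gt"].strip()) for r in csv.DictReader(f)]

    lexicon, train, val = build_annotations(rows)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "lexicon.txt").write_text("\n".join(lexicon) + "\n", encoding="utf-8")
    (out / "annotation_train.txt").write_text("\n".join(train) + "\n", encoding="utf-8")
    (out / "annotation_val.txt").write_text("\n".join(val) + "\n", encoding="utf-8")
```

Calling `write_files("data.csv", "some_output_dir")` (both names are placeholders) would write the three files into that directory, which could then be used as the dataset root.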

download by this script