1. References
[1] Attention Is All You Need: https://arxiv.org/abs/1706.03762
[2] Github: https://github.com/pbcquoc/vietocr
[3] Github: https://github.com/VinhLoiIT/vietnamese-htr
2. Dataset
2.1 Todo
2.2 Structure of Configs
dataset/
|
├── annotation/
| |
| └──cin/
| |
│ ├── train.txt
| ├── val.txt
│ └── test.txt
└── images/
|
├── image1.jpg
├── image2.jpg
├── ...
└── imagen.jpg
config/
|
|── vgg_seq2seq.yml
└── vgg_transformer.yml
2.3 Download
- Cinnamon: Handwriting OCR for Vietnamese Address
https://drive.google.com/drive/folders/1Qa2YA6w6V5MaNV-qxqhsHHoYFRK5JB39
- HANDS-VNOnDB: Vietnamese Online Handwriting Database
http://tc11.cvc.uab.es/datasets/HANDS-VNOnDB2018_1/
3. Pretrained Weights
4. Usage
4.1 Todo
- Predicting with batch images.
- Experience with other backbones (vgg11, vgg19, resnet50, resnext50).
4.2 Usage
- Training
CUDA_VISIBLE_DEVICES=<cuda_indice> python train.py --config config/vgg_transformer.yml
- Testing
CUDA_VISIBLE_DEVICES=<cuda_indice> python test.py --config config/vgg_transformer.yml
- Predicting
CUDA_VISIBLE_DEVICES=<cuda_indice> python predict.py --config config/vgg_transformer.yml --image <image_path>