TrOCR-Handwritten-Mathematical-Expression-Recognition

Handwritten mathematical expression recognition with TrOCR

Description

Generate the LaTeX sequence for a mathematical expression from an image of the handwritten expression.

How to run

  git clone https://github.com/win5923/TrOCR-Handwritten-Mathematical-Expression-Recognition.git
  pip install transformers
  pip install datasets jiwer
  pip install sentencepiece

Train

On Ubuntu, you can start a screen session and run train2.py.
In Jupyter, open train.ipynb and run its cells.

 python train2.py
 jupyter notebook train.ipynb

Inference

Use predict.py or test.py to run inference on new images.
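For orientation, here is a minimal sketch of what inference with a fine-tuned TrOCR checkpoint typically looks like in the transformers library. It is not the code in predict.py or test.py. The function name `recognize`, its parameters, and the checkpoint directory are hypothetical, and it assumes the checkpoint was saved together with its processor and that torch and Pillow are installed.

```python
def recognize(image_path: str, model_dir: str, max_new_tokens: int = 256) -> str:
    """Return the predicted LaTeX string for one handwritten-expression image.

    `model_dir` is a hypothetical path to a fine-tuned checkpoint directory;
    substitute wherever your training run saved its model and processor.
    """
    # Imported inside the function so this sketch can be loaded
    # without the heavy dependencies installed.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_dir)
    model = VisionEncoderDecoderModel.from_pretrained(model_dir)
    model.eval()

    # TrOCR expects 3-channel input; the processor resizes and normalizes it.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Autoregressively decode token ids, then map them back to text.
    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call would look like `recognize("expression.png", "./checkpoint")`, where both paths are placeholders.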

Evaluate

On the CROHME 2016 dataset, CER = 0.193
On the CROHME 2016 dataset, Accuracy = 0.306
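To make the two metrics concrete, the following is a pure-Python sketch of how such numbers are commonly computed. It assumes CER is the total character-level edit distance divided by the total number of reference characters, and that accuracy means the fraction of expressions predicted exactly. The repository's own evaluation may tokenize or normalize differently, and all function names here are illustrative.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Character error rate pooled over the whole dataset."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


def exact_match_accuracy(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Fraction of predictions identical to their reference string."""
    matches = sum(r == h for r, h in zip(references, hypotheses))
    return matches / len(references)


# Toy data: one perfect prediction, one with a single wrong character.
refs = ["x^2+1", r"\frac{a}{b}"]
hyps = ["x^2+1", r"\frac{a}{c}"]
print(cer(refs, hyps), exact_match_accuracy(refs, hyps))
```

Under these definitions a single wrong character makes the whole expression count as incorrect for accuracy while adding only a small amount to CER, which is one reason accuracy can be much lower than 1 minus CER.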

Improve

On the CROHME 2016 test set, this model's accuracy is lower than that of the models shown in the image below.

[image]

Many thanks to @NielsRogge for this notebook, which was very helpful:
https://github.com/NielsRogge/Transformers-Tutorials/blob/master/TrOCR/Fine_tune_TrOCR_on_IAM_Handwriting_Database_using_Seq2SeqTrainer.ipynb