/formula-recognition-sys

A math formula image recognition project that took first place in a competition hosted by NAVER CONNECT boostcamp AI Tech.

Primary Language: Python

๐Ÿ†์ˆ˜์‹ ์ธ์‹: To be Modeler and Beyond!

Task Description

Subject

The task in this competition was to convert images of mathematical formulas into LaTeX-formatted text. LaTeX is a format for writing papers and technical documents and is widely used in the natural sciences. Unlike general optical character recognition (OCR), formula recognition requires multi-line recognition.

Unlike ordinary sentences, formulas require understanding multi-dimensional relationships, such as the numerator and denominator of a fraction or the bounds of a limit. Formula recognition can therefore be viewed as an OCR problem based on multi-line recognition rather than the usual single-line recognition. From this perspective, formula recognition is a task distinct from conventional OCR.

Data

  • Training data: 50,000 printed formula images and 50,000 handwritten formula images, 100,000 formula images in total

  • Test data: 6,000 printed formula images and 6,000 handwritten formula images

Metric

  • ํ‰๊ฐ€ ์ฒ™๋„: 0.9 ร— ๋ฌธ์žฅ ๋‹จ์œ„ ์ •ํ™•๋„ + 0.1 ร— (1 - ๋‹จ์–ด ์˜ค๋ฅ˜์œจ)

  • ๋ฌธ์žฅ ๋‹จ์œ„ ์ •ํ™•๋„(Sentence Accuracy): ์ „์ฒด ์ถ”๋ก  ๊ฒฐ๊ณผ ์ค‘ ๋ช‡ ๊ฐœ์˜ ์ˆ˜์‹์ด ์ •๋‹ต๊ณผ ์ •ํ™•ํžˆ ์ผ์น˜ํ•˜๋Š” ์ง€๋ฅผ ๋‚˜ํƒ€๋‚ธ ์ฒ™๋„์ž…๋‹ˆ๋‹ค.

  • ๋‹จ์–ด ์˜ค๋ฅ˜์œจ(Word Error Rate, WER): ์ถ”๋ก  ๊ฒฐ๊ณผ๋ฅผ ์ •๋‹ต์— ์ผ์น˜ํ•˜๋„๋ก ์ˆ˜์ •ํ•˜๋Š” ๋ฐ ๋‹จ์–ด์˜ ์‚ฝ์ž…, ์‚ญ์ œ, ๋Œ€์ฒด๊ฐ€ ์ด ๋ช‡ ํšŒ ๋ฐœ์ƒํ•˜๋Š” ์ง€๋ฅผ ์ธก์ •ํ•˜๋Š” ์ฒ™๋„์ž…๋‹ˆ๋‹ค.
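To make the weighting concrete, the metric can be sketched in Python as follows. This is a minimal sketch, not the competition's official scorer. It assumes that formulas are tokenized by whitespace and that WER is the total edit distance divided by the total number of ground-truth tokens; all function names are illustrative.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference token
                            curr[j - 1] + 1,      # insert a hypothesis token
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def sentence_accuracy(refs: List[str], hyps: List[str]) -> float:
    """Fraction of predictions whose token sequence equals the ground truth."""
    exact = sum(r.split() == h.split() for r, h in zip(refs, hyps))
    return exact / len(refs)


def word_error_rate(refs: List[str], hyps: List[str]) -> float:
    """Assumption: corpus-level WER = total edits / total ground-truth tokens."""
    edits = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    tokens = sum(len(r.split()) for r in refs)
    return edits / tokens


def competition_score(refs: List[str], hyps: List[str]) -> float:
    """0.9 x Sentence Accuracy + 0.1 x (1 - WER), as stated in the Metric section."""
    if not refs or len(refs) != len(hyps):
        raise ValueError("refs and hyps must be non-empty and of equal length")
    return 0.9 * sentence_accuracy(refs, hyps) + 0.1 * (1.0 - word_error_rate(refs, hyps))


if __name__ == "__main__":
    refs = [r"\frac { a } { b }", "x ^ { 2 }"]
    hyps = [r"\frac { a } { b }", "x ^ { 3 }"]
    print(round(competition_score(refs, hyps), 4))
```

Because Sentence Accuracy carries 90% of the weight, a single wrong token in a formula costs far more through the exact-match term than through WER.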

Project Result

  • 1st place out of 12 teams

  • Public LB Score: 0.8574 / Private LB Score: 0.6288

  • The presentation slides for the 1st-place solution are available here.

  • Example formula recognition results

Installation

# clone repository
git clone https://github.com/bcaitech1/p4-fr-sorry-math-but-love-you.git

# install necessary tools
pip install -r requirements.txt

Dataset Structure

[dataset]/
├── gt.txt
├── tokens.txt
└── images/
    ├── *.jpg
    ├── ...
    └── *.jpg
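As a rough illustration of how this layout might be read, the sketch below loads the two text files. The file formats are assumptions, since this README does not specify them: each line of gt.txt is taken to be an image file name, a tab, and space-separated LaTeX tokens, and tokens.txt is taken to hold one token per line. The function names are hypothetical and are not part of the repository's data_tools module.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def load_tokens(tokens_path: Path) -> Dict[str, int]:
    """Map each token to an integer id (assumed format: one token per line)."""
    lines = tokens_path.read_text(encoding="utf-8").splitlines()
    tokens = [line for line in lines if line]
    return {token: idx for idx, token in enumerate(tokens)}


def load_ground_truth(dataset_dir: Path) -> List[Tuple[Path, List[str]]]:
    """Return (image path, LaTeX token list) pairs read from gt.txt.

    Assumed line format: "<image file name>\t<space-separated LaTeX tokens>".
    """
    samples = []
    gt_text = (dataset_dir / "gt.txt").read_text(encoding="utf-8")
    for line in gt_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, latex = line.partition("\t")
        samples.append((dataset_dir / "images" / name, latex.split()))
    return samples
```

A caller would pass the dataset root, for example `load_ground_truth(Path("[dataset]"))`, and look up each token of a sample in the dictionary returned by `load_tokens`.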

Code Structure

[code]
├── configs/ # configuration files
├── data_tools/ # modules for dataset
├── networks/ # modules for model architecture
├── postprocessing/ # modules for postprocessing during inference
├── schedulers/ # scheduler for learning rate, teacher forcing ratio
├── utils/ # useful utilities
├── inference_modules/ # modules for inference
├── train_modules/ # modules for train
├── README.md
├── requirements.txt
├── train.py
└── inference.py

Command Line Interface

Train

Train with single optimizer

$ python train.py --train_type single_opt --config_file './configs/EfficientSATRN.yaml'

Train with two optimizers for encoder and decoder

$ python train.py --train_type dual_opt --config_file './configs/EfficientSATRN.yaml'

Knowledge distillation training

$ python train.py --train_type distillation --config_file './configs/LiteSATRN.yaml' --teacher_ckpt 'TEACHER-MODEL_CKPT_PATH'

Train with Weights & Biases logging

$ python train.py --train_type single_opt --project_name <PROJECTNAME> --exp_name <EXPNAME> --config_file './configs/EfficientSATRN.yaml'

Arguments

train_type (str): training mode
  • 'single_opt': trains with a single optimizer.
  • 'dual_opt': trains with separate optimizers assigned to the encoder and the decoder.
  • 'distillation': trains with knowledge distillation.
config_file (str): path to the configuration file of the model to train
  • Model configurations differ by architecture; examples can be found here.
  • The trainable models are EfficientSATRN, EfficientASTER, SwinTRN, and LiteSATRN.
teacher_ckpt (str): path to the teacher model checkpoint to load for knowledge distillation training
project_name (str): (optional) project name to use when logging with Weights & Biases during training
exp_name (str): (optional) experiment name to use when logging with Weights & Biases during training
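The way these arguments fit together can be sketched with argparse. This is an illustrative stand-in, not the actual contents of train.py. The flag names come from the list above, but the defaults, the rule that 'distillation' requires teacher_ckpt, and the `use_wandb` attribute are assumptions made for this example.

```python
import argparse
from typing import List, Optional

TRAIN_TYPES = ("single_opt", "dual_opt", "distillation")


def build_train_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the training flags documented above."""
    parser = argparse.ArgumentParser(description="Illustrative training options")
    parser.add_argument("--train_type", choices=TRAIN_TYPES, default="single_opt")
    parser.add_argument("--config_file", type=str, required=True)
    parser.add_argument("--teacher_ckpt", type=str, default=None)
    parser.add_argument("--project_name", type=str, default=None)
    parser.add_argument("--exp_name", type=str, default=None)
    return parser


def parse_train_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the flags and apply two assumed consistency rules."""
    parser = build_train_parser()
    args = parser.parse_args(argv)
    # Assumption: distillation cannot run without a teacher checkpoint.
    if args.train_type == "distillation" and args.teacher_ckpt is None:
        parser.error("--teacher_ckpt is required when --train_type is 'distillation'")
    # Assumption: Weights & Biases logging is enabled only when a project name is given.
    args.use_wandb = args.project_name is not None
    return args


if __name__ == "__main__":
    demo = parse_train_args(
        ["--train_type", "dual_opt", "--config_file", "./configs/EfficientSATRN.yaml"]
    )
    print(demo.train_type, demo.use_wandb)
```

Validating the flag combination at parse time surfaces a missing teacher checkpoint before any model is built.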

Inference

Inference with single model

$ python inference.py --inference_type single --checkpoint <MODELPATH.pth>

Ensemble inference

$ python inference.py --inference_type ensemble --checkpoint <MODEL1PATH.pth> <MODEL2PATH.pth> ...

Arguments

inference_type (str): inference mode
  • single: loads a single model and runs inference.
  • ensemble: loads multiple models and runs ensemble inference.
checkpoint (str): path of the model to load
  • For ensemble inference, list the model paths as follows:

    --checkpoint <MODELPATH_1.pth> <MODELPATH_2.pth> <MODELPATH_3.pth> ...
max_sequence (int): maximum generation length when generating a formula (default: 230)
batch_size (int): batch size (default: 32)
decode_type (str): decoding method
  • 'greedy': decodes with greedy decoding.
  • 'beam': decodes with beam search.
decoding_manager (bool): whether to use DecodingManager
tokens_path (str): path to the token file
  • NOTE: used only when DecodingManager is enabled.
max_cache (int): number of batches of encoder outputs to store temporarily during ensemble ('ensemble') inference
  • NOTE: higher values speed up inference but temporarily take up more storage.
file_path (str): path to the data to run inference on
output_dir (str): directory in which to save the inference results (default: './result/')
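The difference between the 'greedy' and 'beam' decoding options can be illustrated with a toy example. This is a generic sketch of the two techniques, not the repository's decoder. The `step` callback, the `beam_width` parameter, the `<EOS>` token, and the toy probability table are all hypothetical, and the beam search here applies no length normalization.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "<EOS>"
StepFn = Callable[[List[str]], Dict[str, float]]  # prefix -> next-token probabilities


def greedy_decode(step: StepFn, max_sequence: int) -> List[str]:
    """Pick the single most probable token at every step."""
    seq: List[str] = []
    for _ in range(max_sequence):
        probs = step(seq)
        token = max(probs, key=probs.get)
        if token == EOS:
            break
        seq.append(token)
    return seq


def beam_decode(step: StepFn, max_sequence: int, beam_width: int = 3) -> List[str]:
    """Keep the beam_width best prefixes, ranked by summed log-probability."""
    beams: List[Tuple[float, List[str], bool]] = [(0.0, [], False)]
    for _ in range(max_sequence):
        candidates: List[Tuple[float, List[str], bool]] = []
        for score, seq, done in beams:
            if done:
                candidates.append((score, seq, True))  # finished beams carry over
                continue
            for token, p in step(seq).items():
                if p <= 0.0:
                    continue  # skip impossible tokens (log undefined)
                new_score = score + math.log(p)
                if token == EOS:
                    candidates.append((new_score, seq, True))
                else:
                    candidates.append((new_score, seq + [token], False))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
        if all(done for _, _, done in beams):
            break
    return beams[0][1]


def toy_step(seq: List[str]) -> Dict[str, float]:
    """Toy model in which the locally best first token leads to a worse sequence."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    return table.get(tuple(seq), {EOS: 1.0})


if __name__ == "__main__":
    print(greedy_decode(toy_step, max_sequence=230))             # ['a', 'x']
    print(beam_decode(toy_step, max_sequence=230, beam_width=2))  # ['b', 'z']
```

In the toy table, greedy decoding commits to "a" (probability 0.6) and ends with a sequence of probability 0.3, while a beam of width 2 keeps "b" alive and finds the sequence with probability 0.36. Beam search trades extra computation per step for the chance to recover from a locally attractive but globally worse choice.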

Collaboration Tools


Github Issues

Github Discussions

Github Pull Requests

Experiment Logging (W&B)

Who Are We?


고지형
silkstaff@naver.com

김준철
ahaampo5@gmail.com

김형민
doritos2498@gmail.com

송누리
nuri3136@naver.com

이주영
vvvic313@gmail.com

최준구
jungu1106@naver.com