Scaling Up Image-to-LaTeX Performance: Sumen An End-to-End Transformer Model With Large Dataset.
-
To run the model you need Python >= 3.8:
conda create --name img2latex python=3.8 -y
-
Install environment:
pip install -r requirements.txt
or
conda env create -f environment.yml
We provide many Sumen model(base) - 349m params on Hugging Face, which can be downloaded at hoang-quoc-trung/sumen-base.
python train.py --config_path src/config/base_config.yaml --resume_from_checkpoint true
arguments:
-h, --help Show this help message and exit
--config_path Path to configuration file
--resume_from_checkpoint Continue training from saved checkpoint (true/false)
python inference.py --input_image assets/example_1.png --ckpt src/checkpoints
arguments:
-h, --help Show this help message and exit
--input_image Path to image file
--ckpt Path to the checkpoint model
python test.py --config_path src/config/base_config.yaml --ckpt src/checkpoints
arguments:
-h, --help Show this help message and exit
--config_path Path to configuration file
--ckpt Path to the checkpoint model
streamlit run streamlit_app.py --ckpt src/checkpoints
arguments:
-h, --help Show this help message and exit
--ckpt Path to the checkpoint model
or
python gradio_app.py --ckpt src/checkpoints
arguments:
-h, --help Show this help message and exit
--ckpt Path to the checkpoint model
Dataset is available here: Fusion Image To Latex Datasets
The directory data structure can look as follows:
- Save all images in a folder, replace the path as
root
in config file. - Prepare a CSV file with 2 columns:
image_filename
: The name of image file.latex
: Latex code.
Samples:
image_filename | latex |
---|---|
200922-1017-140.bmp | \sqrt { \frac { c } { N } } |
78cd39ce-71fc-4c86-838a-defa185e0020.jpg | \lim_{w\to1}\cos{w} |
KME2G3_19_sub_30.bmp | \sum _ { i = 2 n + 3 m } ^ { 1 0 } i x |
1d801f89870fb81_basic.png | \sqrt { \varepsilon _ { \mathrm { L J } } / m \sigma ^ { 2 } } |