Table Structure Recognition

OhMyTable

Install

pip install ohmytable

Quick Start

Use as a package

from ohmytable import OhMyTable

image_path = "/path/to/your_image_contains_table"
ohmytable = OhMyTable(device="cpu")  # cpu/mps/cuda
htmls = ohmytable(image_path)
# The entire pipeline outputs table structure represented in HTML.
print(htmls)

# Visualize and save the results of all models in the pipeline.
from ohmytable.callback import VisualizeCallback

ohmytable(image_path, callbacks=[VisualizeCallback(image_path, "./tmp")])
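To inspect the result in a browser, the returned HTML can be written to a standalone page. The following is a minimal sketch, assuming `htmls` is either a single HTML string or a list of HTML strings (the return type is not documented here). `save_htmls` is a hypothetical helper for illustration, not part of ohmytable.

```python
from pathlib import Path


def save_htmls(htmls, out_path="tables.html"):
    """Write recognized table HTML into one standalone page for preview."""
    # Assumption: the pipeline returns one HTML string or a list of them.
    if isinstance(htmls, str):
        htmls = [htmls]

    # Give each table its own heading so multiple results stay distinguishable.
    sections = [
        f"<h2>Table {i}</h2>\n{table_html}"
        for i, table_html in enumerate(htmls, start=1)
    ]

    page = (
        "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n"
        "<style>table, td, th { border: 1px solid #999; "
        "border-collapse: collapse; padding: 4px; }</style>\n"
        "</head>\n<body>\n" + "\n".join(sections) + "\n</body>\n</html>\n"
    )

    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)  # create ./tmp etc. if missing
    out.write_text(page, encoding="utf-8")
    return out
```

With the Quick Start variables above, `save_htmls(htmls, "./tmp/tables.html")` would write the page next to the visualization output.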

Start a Gradio web demo:

git clone https://github.com/Sanster/OhMyTable.git
cd OhMyTable
pip install gradio typer
python3 gradio_demo.py

Limitations

  • The table structure recognition model is trained with a maximum output length of 1024 (about 150 table cell boxes), so larger tables may not be fully recognized.
  • The training data for the current table recognition model contains a lot of dirty data. I may train a new model after cleaning the data.
  • The model works better when the input image has little padding around the table.
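Since padding affects accuracy, it can help to crop the image tightly around the table before running the pipeline. The sketch below assumes the table's bounding box is already known (for example from a layout detector or a manual selection). The helper names and the default margin of 8 pixels are illustrative choices, not part of ohmytable, and `crop_table` additionally assumes Pillow is installed.

```python
def tight_crop_box(table_box, image_size, margin=8):
    """Expand a table box by a small margin and clamp it to the image bounds."""
    left, top, right, bottom = table_box
    width, height = image_size
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(width, right + margin),
        min(height, bottom + margin),
    )


def crop_table(image_path, table_box, out_path, margin=8):
    """Save a tightly cropped copy of the table region (requires Pillow)."""
    # Imported lazily so tight_crop_box stays dependency-free.
    from PIL import Image

    with Image.open(image_path) as img:
        box = tight_crop_box(table_box, img.size, margin)
        img.crop(box).save(out_path)
    return box
```

The cropped file written to `out_path` can then be passed to `OhMyTable` in place of the original image.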

Acknowledgement