Deep Learning based Table Detection (LUMINOTH)
This project focuses on detecting tables in PDF documents and extracting their contents, using Keras and the TensorFlow Object Detection API.
The system shall work in 2 steps:
Step 1: Accept document input, read tables: The system should have an input mechanism for accepting document images (TIFF, JPEG). The document may have one or more tables.
Step 2: As output, the system should return the table content in Excel format, the same as in the sample data sets.
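The second step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual implementation: it assumes an earlier stage has already produced table cells as hypothetical (x, y, text) tuples, the helper names cells_to_rows and rows_to_csv are made up for this example, and it writes CSV (which Excel can open) instead of a native Excel file so that only the standard library is needed.

```python
import csv
import io


def cells_to_rows(cells, row_tolerance=10):
    """Group hypothetical (x, y, text) cells into table rows.

    Cells whose y coordinate lies within `row_tolerance` pixels of the
    first cell of the current row are placed in that row. Each row is
    then ordered left to right by x.
    """
    rows = []
    for x, y, text in sorted(cells, key=lambda cell: cell[1]):
        if rows and abs(y - rows[-1]["y"]) <= row_tolerance:
            rows[-1]["cells"].append((x, text))
        else:
            # The cell sits clearly below the current row: start a new one.
            rows.append({"y": y, "cells": [(x, text)]})
    return [
        [text for _, text in sorted(row["cells"], key=lambda cell: cell[0])]
        for row in rows
    ]


def rows_to_csv(rows):
    """Serialize rows to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample cells, given out of order on purpose.
    sample = [(120, 52, "Qty"), (10, 50, "Item"), (10, 90, "Pen"), (120, 93, "3")]
    print(rows_to_csv(cells_to_rows(sample)))
```

Grouping by a pixel tolerance is a simple heuristic; tables with merged or multi-line cells would need something more careful.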
DEVELOPMENT IS IN PROGRESS! THE REPO WILL BE UPDATED SOON!
Luminoth currently supports Python 2.7 and 3.4–3.6.
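As a quick sanity check before installing, the supported versions listed above can be tested in a few lines. This is a minimal sketch; the function name python_is_supported is an assumption made for this example and is not part of Luminoth.

```python
import sys


def python_is_supported(version=None):
    """Return True for Python 2.7 or 3.4 through 3.6, as listed above.

    `version` is a (major, minor, ...) tuple; it defaults to the
    version of the running interpreter.
    """
    major, minor = tuple(version or sys.version_info)[:2]
    return (major, minor) == (2, 7) or (major == 3 and 4 <= minor <= 6)


if __name__ == "__main__":
    if not python_is_supported():
        print("This Python version is outside the range listed as supported.")
```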
To use Luminoth, TensorFlow must be installed beforehand. If you want GPU support, you should install the GPU version of TensorFlow with pip install tensorflow-gpu, or else you can use the CPU version using pip install tensorflow.
Just install from PyPI:
pip install luminoth
Optionally, Luminoth can also install TensorFlow for you if you install it with pip install luminoth[tf] or pip install luminoth[tf-gpu], depending on the version of TensorFlow you wish to use.
If you wish to train using Google Cloud ML Engine, the optional dependencies must be installed:
pip install luminoth[gcloud]
First, clone the repo on your machine and then install with pip:
git clone https://github.com/tryolabs/luminoth.git
cd luminoth
pip install -e .
This system is available under the MIT license. See the LICENSE file for more info.