This repository contains code that extracts a table from an image and exports it to an Excel file. The image is first "read" by an OCR service, which returns a JSON output; that JSON is the input to the program. The program then arranges the cells into rows and columns according to the positions given in the JSON.
NOTE: Only the cells that the OCR actually read will appear in the Excel file.
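As a rough illustration of the row and column arrangement described above, here is a minimal sketch. It is not the code from this repository: the field names `boundingBox` (an "x,y,width,height" string) and `text`, the function names, and the pixel tolerance are all assumptions made for the example. Cells are placed in a grid by clustering their top and left coordinates, and any position the OCR did not read stays empty.

```python
from bisect import bisect_right


def parse_box(box):
    """Turn an "x,y,width,height" string into a tuple of ints."""
    x, y, w, h = (int(v) for v in box.split(","))
    return x, y, w, h


def cluster(values, tolerance):
    """Group coordinates into bands and return the start of each band.

    A new band begins when a value lies more than `tolerance` pixels
    past the start of the current band.
    """
    starts = []
    for v in sorted(values):
        if not starts or v - starts[-1] > tolerance:
            starts.append(v)
    return starts


def arrange(cells, tolerance=10):
    """Place OCR cells into a row-by-column grid of strings."""
    boxes = [(parse_box(c["boundingBox"]), c["text"]) for c in cells]
    row_starts = cluster([box[1] for box, _ in boxes], tolerance)
    col_starts = cluster([box[0] for box, _ in boxes], tolerance)

    # Start with an empty grid; cells the OCR missed remain "".
    grid = [["" for _ in col_starts] for _ in row_starts]

    # Visit cells left to right so text sharing a slot joins in reading order.
    for (x, y, _w, _h), text in sorted(boxes, key=lambda item: item[0][0]):
        r = bisect_right(row_starts, y) - 1
        c = bisect_right(col_starts, x) - 1
        grid[r][c] = (grid[r][c] + " " + text).strip()
    return grid


if __name__ == "__main__":
    # Hypothetical sample: a 3x2 table where one cell was not read.
    sample = [
        {"boundingBox": "10,10,50,20", "text": "Name"},
        {"boundingBox": "120,12,40,20", "text": "Age"},
        {"boundingBox": "11,50,60,20", "text": "Alice"},
        {"boundingBox": "121,48,30,20", "text": "30"},
        {"boundingBox": "10,90,40,20", "text": "Bob"},
    ]
    for row in arrange(sample):
        print(row)
```

The tolerance is a judgment call: too small and slightly skewed rows split apart, too large and neighbouring rows merge, so it would need tuning for the image resolution in use.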
os
copy
pandas==0.22.0
openpyxl==2.4.9
You can also install the packages from requirements.txt by running pip install -r requirements.txt.
Image -> JSON -> Excel
- First, install the packages listed in requirements.txt.
- To "read" an image, use an OCR service that returns its result as JSON.
- You can use your own OCR or the Microsoft Azure Cognitive Services OCR API.
- Alternatively, upload the image to their text reader demo, which returns the JSON for the image. Save that JSON to a text file before running the program.
- In the program, change the input path and output path to match your setup.
- Run the program (JSON-to-Excel.py).
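The input and output handling in the steps above can be sketched as follows. This is an illustration only, not the contents of JSON-to-Excel.py: the file names are hypothetical, and the `regions` / `lines` / `words` nesting is an assumption modeled on the layout of an Azure-style OCR response. The sketch writes a flat listing of the recognised lines and their positions; arranging them into table rows and columns is the program's job and is not reproduced here.

```python
import json


def load_ocr_json(path):
    """Read the OCR response that was saved to disk as JSON text."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def collect_lines(ocr):
    """Flatten the assumed regions -> lines -> words nesting.

    Returns one dict per line, holding its bounding box and the
    text of its words joined with spaces.
    """
    cells = []
    for region in ocr.get("regions", []):
        for line in region.get("lines", []):
            text = " ".join(word["text"] for word in line.get("words", []))
            cells.append({"boundingBox": line["boundingBox"], "text": text})
    return cells


def write_rows_to_excel(rows, path):
    """Write a list of rows (each a list of cell values) to an .xlsx file."""
    # Imported here so the JSON helpers above work without pandas installed.
    import pandas as pd

    pd.DataFrame(rows).to_excel(path, index=False, header=False)


if __name__ == "__main__":
    INPUT_PATH = "ocr_output.json"  # hypothetical: where the JSON was saved
    OUTPUT_PATH = "table.xlsx"      # hypothetical: where the Excel file goes

    lines = collect_lines(load_ocr_json(INPUT_PATH))
    rows = [[line["text"], line["boundingBox"]] for line in lines]
    write_rows_to_excel(rows, OUTPUT_PATH)
```

Writing .xlsx files through pandas relies on openpyxl being installed, which is why both packages appear in the requirements.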
Input Image:
Its Corresponding JSON:
Excel Output: