This project uses the Tesseract OCR engine to detect characters in an image of a table and saves the extracted data to a JSON file.
Input Image:
Expected output:
E.g.
{
"seller": "",
"buyer": "<Value of Buyer Name & Address(Importer of Record) from the document>",
...
}
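As a minimal sketch of the final step (not necessarily how main.py does it), the extracted values could be written to a JSON file as shown here. The helper names `save_fields` and `load_fields` and the output file name are hypothetical; only the `seller` and `buyer` keys come from the example above.

```python
import json
from pathlib import Path


def save_fields(fields: dict, out_path: str) -> None:
    """Write the extracted fields to a JSON file (hypothetical helper)."""
    # ensure_ascii=False keeps non-ASCII characters from the document readable.
    text = json.dumps(fields, indent=2, ensure_ascii=False)
    Path(out_path).write_text(text, encoding="utf-8")


def load_fields(out_path: str) -> dict:
    """Read the JSON file back, e.g. to check what was saved."""
    return json.loads(Path(out_path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Placeholder values; real values would come from the OCR step.
    save_fields({"seller": "", "buyer": "Example Buyer"}, "output.json")
    print(load_fields("output.json"))
```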
Clone this GitHub repository and open it in any IDE of your choice. Install the required libraries, such as pytesseract and OpenCV. In the terminal, make sure you are in the project root directory. Before running, change the path of the input image in main.py to point to your own image.
- OpenCV
- pandas
- numpy
- json (part of the Python standard library, no installation needed)
- pytesseract
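To show how these libraries might fit together, here is a rough sketch of an OCR step. It is an assumption about the general approach, not the code in main.py: the function names `ocr_image` and `normalize_ocr_text` are hypothetical, and the grayscale plus Otsu threshold preprocessing is just one common choice. Note that pytesseract is a wrapper, so the Tesseract engine itself must also be installed on the system.

```python
import re


def normalize_ocr_text(raw: str) -> str:
    """Collapse the runs of whitespace that OCR output often contains."""
    return re.sub(r"\s+", " ", raw).strip()


def ocr_image(image_path: str) -> str:
    """Read an image, binarize it, and return the recognized text (sketch)."""
    # Imported here so normalize_ocr_text works without the OCR stack installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when the file is unreadable.
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Assumed preprocessing: grayscale, then Otsu thresholding.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    return normalize_ocr_text(pytesseract.image_to_string(binary))
```

Calling `ocr_image("path/to/your/image.png")` with your own image path would return the recognized text as a single cleaned-up string, which could then be split into fields and saved as JSON.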
# From the project root directory
python3 main.py
- usndangol97
If you have any questions, visit my GitHub profile or email me: usndangol97@gmail.com