ML Tasks RPALabs

Description

OCR using the Tesseract engine to detect characters in an image of a table and save the extracted data to a JSON file.
Input image:
[Image: Input Image]


Expected output:
E.g.
{
"seller": "",
"buyer": "<Value of Buyer Name & Address(Importer of Record) from the document>",
...
}
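
The flow described above (read the image, run Tesseract, write JSON) can be sketched roughly as follows. This is a minimal sketch and not necessarily how main.py does it: the function names `ocr_image_to_text` and `save_as_json` are hypothetical, and the grayscale plus Otsu thresholding step is an assumed preprocessing choice. Mapping the recognised text to fields such as "seller" and "buyer" depends on the document layout and is not shown.

```python
import json


def ocr_image_to_text(image_path):
    """Return the text Tesseract recognises in the image at image_path."""
    # Imported lazily so the JSON helper below works without the OCR stack.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None instead of raising an error
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Assumed preprocessing: grayscale + Otsu thresholding to sharpen the text.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)


def save_as_json(fields, output_path):
    """Write a dict of extracted fields to a JSON file."""
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(fields, f, indent=2, ensure_ascii=False)
```

A call such as `save_as_json({"seller": "", "buyer": "..."}, "output.json")` would then produce a file in the shape shown above; the file name here is only an example.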


Installation

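Clone this GitHub repository and open it in any IDE of your choice. Install the required libraries, such as pytesseract and OpenCV (the opencv-python package on pip). Note that pytesseract is only a wrapper, so the Tesseract OCR engine itself must also be installed on your system. In the terminal, make sure you are in the project root directory. Before running, change the path of the input image in main.py so that it points to your own image.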

Library Usage

  • OpenCV
  • pandas
  • NumPy
  • json
  • pytesseract
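
Before running the project, it can help to confirm that these libraries are importable. The snippet below is a small sketch for that purpose; `missing_modules` and `REQUIRED_MODULES` are hypothetical names, not part of this repository. It uses the import names `cv2` (OpenCV) and `pytesseract`, and also checks that a `tesseract` executable is on the PATH, since pytesseract calls that binary by default.

```python
import importlib.util
import shutil

# Import names for the libraries listed above (OpenCV is imported as cv2).
REQUIRED_MODULES = ["cv2", "pandas", "numpy", "json", "pytesseract"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules were found.")

    # pytesseract needs the Tesseract engine itself, not just the wrapper.
    if shutil.which("tesseract") is None:
        print("The 'tesseract' executable was not found on PATH.")
```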

Demos

# From the project root directory
python3 main.py

Contributors

  • usndangol97

Questions

If you have any questions, visit my GitHub profile or email me: usndangol97@gmail.com