Collect the important details from an invoice document (PDF)
shq251 opened this issue · 0 comments
shq251 commented
Hi all,
I want to build a project that extracts the important details from invoice PDF documents (invoice number, date, total due, seller name, etc.) as key-value pairs.
We generate an hOCR file from each PDF file using an OCR engine (Tesseract).
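To illustrate the input, the word boxes in an hOCR file can be read roughly like this. It is a minimal Python sketch using only the standard library, and it assumes Tesseract's usual layout: one `ocrx_word` span per word, with a `bbox x0 y0 x1 y1` entry in the `title` attribute. The names `HocrWordParser` and `parse_hocr_words` and the sample markup are made up for illustration and are not part of "catalyst".

```python
import re
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collects (text, bbox) for every ocrx_word span in an hOCR document."""

    def __init__(self):
        super().__init__()
        self.words = []    # list of (text, (x0, y0, x1, y1))
        self._bbox = None  # bbox of the word span currently open, if any
        self._text = []    # text fragments seen inside that span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            # The title attribute looks like: "bbox 10 10 90 40; x_wconf 96"
            m = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", attrs.get("title") or "")
            if m:
                self._bbox = tuple(int(v) for v in m.groups())
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes word spans do not contain nested spans.
        if tag == "span" and self._bbox is not None:
            text = "".join(self._text).strip()
            if text:
                self.words.append((text, self._bbox))
            self._bbox = None


def parse_hocr_words(hocr_text):
    """Return the words of an hOCR string in document order."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


# Made-up sample in the shape Tesseract emits; not taken from a real invoice.
sample = """
<span class='ocr_line' title='bbox 10 10 300 40'>
  <span class='ocrx_word' id='word_1_1' title='bbox 10 10 90 40; x_wconf 96'>Invoice</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 100 10 160 40; x_wconf 95'>Number:</span>
  <span class='ocrx_word' id='word_1_3' title='bbox 170 10 300 40; x_wconf 93'>INV-001</span>
</span>
"""

for text, bbox in parse_hocr_words(sample):
    print(text, bbox)
```

This gives us each word together with its position on the page, which is the data we would like to turn into key-value pairs.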
Could you please advise how to proceed from the hOCR file to extract the key-value pairs using "catalyst"?
Alternatively, is there another approach for producing the key-value pairs with "catalyst"?
Thanks in advance.