pdf_parser

This repository is a PDF parser. It can extract text, images, curves, and table information.

Primary language: Python

PDF splitting and image conversion

# Make sure the PDF folder and the log directory are set correctly
pdf_file_path = "data/client"
logs_dir = "logs"

Run the script below:

pdf_spliter.py

It splits each PDF page by page using a batch method, saves the single-page PDFs, and converts each page into an image. The output is written under the log directory:

logs
- images
- pdfs
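
The README does not show how pdf_spliter.py works internally, so the following is only a minimal sketch of batch-wise splitting and image conversion, not the repository's actual code. It assumes the third-party packages pypdf and pdf2image (which needs poppler installed); the function names `batch_ranges` and `split_and_render`, the batch size, the DPI, and the output file naming are all hypothetical.

```python
from pathlib import Path


def batch_ranges(n_pages: int, batch_size: int) -> list[tuple[int, int]]:
    """Return half-open (start, stop) page-index ranges covering n_pages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [(start, min(start + batch_size, n_pages))
            for start in range(0, n_pages, batch_size)]


def split_and_render(pdf_file: str, logs_dir: str = "logs",
                     batch_size: int = 10, dpi: int = 200) -> None:
    """Write one PDF and one JPEG per page under logs_dir/pdfs and logs_dir/images."""
    # Third-party imports are local so batch_ranges stays stdlib-only.
    from pypdf import PdfReader, PdfWriter   # assumption: pip install pypdf
    from pdf2image import convert_from_path  # assumption: pip install pdf2image

    pdf_dir = Path(logs_dir) / "pdfs"
    img_dir = Path(logs_dir) / "images"
    pdf_dir.mkdir(parents=True, exist_ok=True)
    img_dir.mkdir(parents=True, exist_ok=True)

    reader = PdfReader(pdf_file)
    stem = Path(pdf_file).stem
    for start, stop in batch_ranges(len(reader.pages), batch_size):
        # Render one batch at a time; pdf2image page numbers are 1-based.
        images = convert_from_path(pdf_file, dpi=dpi,
                                   first_page=start + 1, last_page=stop)
        for index, image in zip(range(start, stop), images):
            # Copy a single page into its own PDF file.
            writer = PdfWriter()
            writer.add_page(reader.pages[index])
            with open(pdf_dir / f"{stem}_{index}.pdf", "wb") as handle:
                writer.write(handle)
            image.save(img_dir / f"{stem}_{index}.jpg", "JPEG")
```

Processing pages in batches keeps only a few rendered images in memory at a time, which matters for long documents. A call such as `split_and_render("data/client/example.pdf")` (with a placeholder file name) would fill the images and pdfs folders shown above.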

Text Line Extraction

Make sure the split PDF path and the image path are set correctly:
pdf_path = "logs/pdf"
img_path = "logs/images"

Run

text_extraction_line.py

![alt text](<https://github.com/saiful9379/pdf_parser/logs/360 Painting 2021-FDD (page 126)_116.jpg>)
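
The README does not describe how text_extraction_line.py builds its lines. As an illustration only, one common approach is to group word boxes that share roughly the same vertical position. The sketch below assumes the third-party package pdfplumber for reading words; the function names and the `y_tolerance` value are hypothetical and may differ from what the script does.

```python
def group_words_into_lines(words, y_tolerance=3.0):
    """Group word boxes into text lines by vertical position.

    Each word is a dict with "text", "x0" and "top" keys, which is the
    shape pdfplumber's extract_words() returns.
    """
    lines = []  # each entry: {"top": float, "words": [word, ...]}
    for word in sorted(words, key=lambda w: (w["top"], w["x0"])):
        # Join the current line if the word sits close to that line's first word.
        if lines and abs(word["top"] - lines[-1]["top"]) <= y_tolerance:
            lines[-1]["words"].append(word)
        else:
            lines.append({"top": word["top"], "words": [word]})
    # Order the words of each line left to right and join them with spaces.
    return [" ".join(w["text"] for w in sorted(line["words"], key=lambda w: w["x0"]))
            for line in lines]


def extract_lines(pdf_file, y_tolerance=3.0):
    """Return a list of text lines for every page of pdf_file."""
    import pdfplumber  # assumption: pip install pdfplumber

    with pdfplumber.open(pdf_file) as pdf:
        return [group_words_into_lines(page.extract_words(), y_tolerance)
                for page in pdf.pages]
```

The grouping step is kept separate from the PDF reading so it can be tried on hand-written word boxes without any PDF library installed.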