
PDF processor for seamlessly extracting data into desired formats.

Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).


PDFProc

PDFProc is a seamless way to extract, transform, and reassemble (ETR) data stored in PDF documents. For now, PDFProc uses OpenAI's gpt-3.5-turbo LLM through question answering (Q&A) with LangChain, returning data based on its relationship to identified tags.
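
To make the tag-based Q&A idea concrete, here is a minimal sketch of how one question per tag could be asked and the answers reassembled into a record. It is not PDFProc's actual code: `build_prompt`, `extract_by_tags`, and `fake_ask` are hypothetical names, the prompt wording is an assumption, and `fake_ask` is an offline stand-in for a real model call so the example runs without an API key.

```python
from typing import Callable, Dict, List


def build_prompt(tag: str, context: str) -> str:
    """Build one question asking for the value related to `tag` (hypothetical wording)."""
    return (
        "Using only the context below, answer with the value for the tag "
        f"'{tag}'. Reply 'UNKNOWN' if it is not present.\n\n"
        f"Context:\n{context}"
    )


def extract_by_tags(
    text: str,
    tags: List[str],
    ask: Callable[[str], str],
) -> Dict[str, str]:
    """Ask one question per tag, trim each answer, and collect a tag -> value record."""
    record: Dict[str, str] = {}
    for tag in tags:
        answer = ask(build_prompt(tag, text))  # extract
        record[tag] = answer.strip()           # transform, then reassemble
    return record


def fake_ask(prompt: str) -> str:
    """Offline stand-in for an LLM call: looks for a 'tag: value' line in the context."""
    tag = prompt.split("'")[1]  # the tag is the first quoted string in the prompt
    context = prompt.split("Context:\n", 1)[1]
    for line in context.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == tag.lower():
            return value
    return "UNKNOWN"


if __name__ == "__main__":
    # Text as it might look after being pulled out of a PDF (made-up sample).
    sample = "Invoice Number: INV-001\nTotal: 42.00 USD"
    print(extract_by_tags(sample, ["Invoice Number", "Total", "Due Date"], fake_ask))
```

In a real setup, `ask` would wrap a call to the language model (for example through a LangChain question-answering chain), and `text` would come from a PDF text extractor. Keeping the model behind a plain callable makes the flow easy to test without network access.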

Quickstart

  1. Ensure requirements are installed (virtual environment recommended).
  2. Run python main.py in your terminal.
  3. Coming soon

How it works

Design Doc