Streamline PDF data retrieval with PDFIntellect, harnessing the intelligence of LLMs via an intuitive Streamlit interface.

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

PDFIntellect: Smart PDF Data Retrieval

Introduction

PDFIntellect is a Streamlit app for smart PDF data retrieval. It uses large language models (LLMs) to extract information from PDF documents.

Features

  • Advanced PDF parsing.
  • Integration with pre-trained Language Models.
  • Customizable cascading LLMs.
  • Intelligent short answer generation.
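
How these features might fit together is sketched below. This is an illustration under stated assumptions, not the app's actual code: the function names (`extract_text`, `make_qa_model`, `cascade_answer`) and the 0.5 confidence threshold are hypothetical, and "cascading" is read here as trying models in order until one is confident enough. The pdfplumber and transformers calls are imported lazily, so the cascade logic can be tried without either library installed.

```python
from typing import Callable, Iterable, Tuple

# Assumption: a "model" is any callable mapping (question, context) -> (answer, score).
QAModel = Callable[[str, str], Tuple[str, float]]


def extract_text(pdf_path: str) -> str:
    """Concatenate the text of every page of a PDF using pdfplumber."""
    import pdfplumber  # lazy import: only needed when a real PDF is parsed

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def make_qa_model(model_name: str) -> QAModel:
    """Wrap a Hugging Face question-answering pipeline as a QAModel."""
    from transformers import pipeline  # lazy import: downloads the model on first use

    qa = pipeline("question-answering", model=model_name)

    def run(question: str, context: str) -> Tuple[str, float]:
        result = qa(question=question, context=context)
        return result["answer"], float(result["score"])

    return run


def cascade_answer(
    question: str,
    context: str,
    models: Iterable[QAModel],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the project
) -> Tuple[str, float]:
    """Try each model in order and stop at the first confident answer.

    If no model reaches the threshold, return the highest-scoring answer seen.
    """
    best: Tuple[str, float] = ("", 0.0)
    for model in models:
        answer, score = model(question, context)
        if score >= threshold:
            return answer, score
        if score > best[1]:
            best = (answer, score)
    return best
```

A typical call would pass `extract_text("report.pdf")` as the context and a list such as `[make_qa_model("distilbert-base-cased-distilled-squad"), make_qa_model("deepset/roberta-base-squad2")]` as the models. The file name and model choices are examples only, not the ones this project ships with. Ordering the list from cheapest to most expensive keeps the larger model idle whenever the smaller one is already confident.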

Requirements

  • Python 3.7 or higher.
  • Streamlit, transformers, torch, and pdfplumber libraries.
  • Access to pre-trained Language Models.
  • PDF parsing libraries.
  • PDF documents for extraction.
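
A quick way to check the Python and library requirements before launching the app is sketched below. It is not part of the repository: the helper names `missing_modules` and `python_ok` are hypothetical, and the script only tests whether each module can be found, not which version is installed.

```python
import importlib.util
import sys

# Module names taken from the requirements list above.
REQUIRED_MODULES = ("streamlit", "transformers", "torch", "pdfplumber")
MIN_PYTHON = (3, 7)


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the stated minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing modules:", missing_modules() or "none")
```

Any module reported as missing should be picked up by the `pip install -r requirements.txt` step in the Usage section, assuming the requirements file lists it.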

Usage

  1. Clone the repository.

    git clone https://github.com/jaywyawhare/PDFIntellect
    cd PDFIntellect
  2. Install the required libraries.

    pip install -r requirements.txt
  3. Run the Streamlit app.

    streamlit run app.py

By default, the app will be available at http://localhost:8501.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0).