/llm

Python LLM wrapper for retrieval-augmented generation (RAG) over local PDF files

PDF Q&A using Large Language Models (LLMs) from Hugging Face

This script lets you ask questions about one or more PDF documents and get answers generated by an (S)LLM of your choice. It builds a retrieval-augmented generation (RAG) pipeline over the PDF: the passages most relevant to your question are retrieved and handed to the model, which answers in natural language.

Features

  • Question-Answering: Ask questions in natural language about the content of your PDF.
  • Hugging Face Integration: Uses the Hugging Face Transformers library to access a wide range of state-of-the-art LLMs.
  • Sentence Embeddings: Uses sentence embeddings to efficiently find the most relevant parts of the PDF to answer your questions.
  • Automatic Dependency Management: Checks and installs required libraries to ensure a smooth setup.
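The retrieval step behind the Sentence Embeddings feature can be sketched as follows. This is a minimal illustration, not the script's actual code: `embed` is a toy bag-of-words stand-in for a real sentence-embedding model, and the names `embed`, `cosine`, and `top_k_chunks` are hypothetical. Only the ranking logic (score every chunk against the question, keep the best k) reflects the general technique.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector.

    Assumption: a real setup would call a sentence-embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question, best first."""
    q_vec = embed(question)
    scored = [(cosine(q_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


# Hypothetical PDF chunks, used only for illustration
chunks = [
    "The warranty covers parts and labour for two years.",
    "Shipping takes five business days within Europe.",
    "To claim the warranty, contact support with your receipt.",
]
best = top_k_chunks("How long is the warranty?", chunks, k=2)
```

The selected chunks would then be placed in the prompt sent to the model, so that it answers from the PDF content rather than from memory alone.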

Requirements

  • Python 3.9 or higher: Please ensure you have a compatible version of Python installed.
  • Hugging Face Account: You'll need a Hugging Face account to access their models. You can create one for free at https://huggingface.co/.
  • Libraries: The following Python libraries are required and will be installed automatically if not present:
    • langchain
    • transformers
    • accelerate
    • bitsandbytes
    • sentence_transformers
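A check for these requirements might look roughly like the sketch below. It is an assumption about how such a check could work, not a copy of the script's logic; `python_is_supported`, `missing_modules`, and `install` are hypothetical helper names, and the import names are assumed to match the list above.

```python
import importlib.util
import subprocess
import sys

# Assumption: import names match the list above; pip package names can differ.
REQUIRED = [
    "langchain",
    "transformers",
    "accelerate",
    "bitsandbytes",
    "sentence_transformers",
]


def python_is_supported(minimum=(3, 9)):
    """True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install(packages):
    """Install packages with the pip that belongs to the current interpreter."""
    if packages:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])


# Typical use (left commented out so the sketch has no side effects):
# install(missing_modules(REQUIRED))
```

Calling pip through `sys.executable -m pip` keeps the installation inside the same environment that runs the script, which avoids a common cause of "installed but not importable" errors.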

Usage

  1. Save the Script: Download this script and save it as pdf_qa.py.

  2. Install Dependencies: The script tries to install and update the required libraries itself, but this can occasionally fail. If it does, open your terminal or command prompt and run:

    pip install -r requirements.txt

  3. Run the Script:

    python3 pdf_qa.py [model_id] [pdf_file_path]

    Replace [model_id] with the Hugging Face model ID you want to use (e.g., mistralai/Mistral-7B-Instruct-v0.1). You can find a list of available models at https://huggingface.co/models. Replace [pdf_file_path] with the path to your PDF file(s).

  4. Ask Questions: You'll be prompted to enter questions. Type your questions in natural language and press Enter. The script will provide answers based on the content of the PDF.

  5. Exit: Type exit and press Enter to quit the script.
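The command-line interface and question loop described in steps 3 to 5 can be sketched as below. This is an illustration under assumptions, not the script's real code: `build_parser` and `question_loop` are hypothetical names, passing several PDFs as extra positional arguments is a guess at how "PDF file(s)" is handled, and the `answer` callable stands in for the actual RAG pipeline.

```python
import argparse


def build_parser():
    """Parser for: python3 pdf_qa.py [model_id] [pdf_file_path] ..."""
    parser = argparse.ArgumentParser(
        prog="pdf_qa.py", description="Ask questions about PDF files."
    )
    parser.add_argument("model_id", help="Hugging Face model ID")
    # Assumption: multiple PDFs are given as additional positional arguments.
    parser.add_argument("pdf_paths", nargs="+", help="one or more PDF files")
    return parser


def question_loop(answer, read=input, write=print):
    """Prompt for questions until the user types 'exit' or input ends."""
    while True:
        try:
            question = read("Question: ").strip()
        except EOFError:
            break
        if question.lower() == "exit":
            break
        if question:  # ignore empty lines
            write(answer(question))


# Parse an example command line instead of the real sys.argv
args = build_parser().parse_args(
    ["mistralai/Mistral-7B-Instruct-v0.1", "manual.pdf", "notes.pdf"]
)

# Scripted demo: a fake reader and a placeholder answer function
scripted = iter(["What is covered?", "", "exit"])
outputs = []
question_loop(
    answer=lambda q: f"(answer to: {q})",
    read=lambda prompt: next(scripted),
    write=outputs.append,
)
```

Injecting `read` and `write` keeps the loop testable without a terminal; in interactive use the defaults `input` and `print` apply.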

License

This code is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. See LICENSE.md for details.

Contributing

Contributions are welcome! Please feel free to fork this repository and submit pull requests.

Disclaimer

This script is provided as-is for educational and personal use. It is not intended for production or commercial applications. The author assumes no liability for any consequences arising from the use of this script.