/openai_pdf_qa

Primary Language: Jupyter Notebook

Description

This sample project showcases OpenAI's question-answering (Q&A) capability for document understanding, using LangChain and ChromaDB. The pipeline has the following steps:

  1. Read PDF
  2. Transform PDF to image
  3. Perform OCR to extract Text
  4. Split the document text into smaller chunks
  5. Embed the chunks with OpenAI embeddings and load them into a vector search DB (ChromaDB)
  6. Query the model
  7. Enjoy the answers!
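
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: all function and class names (`ocr_pdf`, `chunk_text`, `embed`, `ToyVectorStore`, `build_prompt`) are hypothetical. The OCR step assumes the `pdf2image` and `pytesseract` packages (plus Poppler and Tesseract) are installed. To keep the sketch runnable without an API key, a bag-of-words vector stands in for OpenAI embeddings and a small in-memory class stands in for ChromaDB.

```python
import math
import re
from collections import Counter


def ocr_pdf(pdf_path, lang="eng"):
    """Steps 1-3: render each PDF page to an image, then OCR it.

    Assumption: pdf2image and pytesseract are installed. They are imported
    lazily so the rest of this sketch runs without them.
    """
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=300)
    return "\n".join(pytesseract.image_to_string(p, lang=lang) for p in pages)


def chunk_text(text, chunk_size=200, overlap=50):
    """Step 4: split text into overlapping character windows."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Step 5: in-memory stand-in for a vector DB such as ChromaDB."""

    def __init__(self):
        self.items = []  # list of (chunk, vector) pairs

    def add(self, chunks):
        self.items.extend((c, embed(c)) for c in chunks)

    def query(self, question, k=2):
        """Return the k chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Step 6: assemble the text that would be sent to the language model."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add([
        "The contractor shall complete the work by June.",
        "Payment is due within thirty days of invoice.",
    ])
    question = "When is payment due?"
    print(build_prompt(question, store.query(question, k=1)))
```

In a real run, the text returned by `ocr_pdf` would be passed through `chunk_text` before being added to the store, and the prompt would be sent to an OpenAI model. For the Spanish-language use case, passing `lang="spa"` to `ocr_pdf` would select Tesseract's Spanish model, assuming that language pack is installed.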

The following use cases are explored:

  1. Performing Q&A over a construction contract (in English)
  2. Performing Q&A over Panama's Data Protection Law (in Spanish)

This code can run directly in Paperspace Gradient or Google Colab.

---- FOR DEMO USE ONLY, NOT FOR COMMERCIAL USE ----