This application is a conversational AI assistant that answers questions based on the contents of PDF documents. It uses LangChain, FAISS, Pinecone, and Streamlit to build a chat interface that interacts with users in real time.
- Conversational AI: Ask questions related to the content of the uploaded PDF documents.
- Document Retrieval: Retrieve relevant information from large documents.
- Supports Pinecone and FAISS: Use Pinecone for hosted vector storage or FAISS for local vector storage.
- Streamlit UI: User-friendly interface to interact with the AI assistant.
- Python 3.8 or later
- Virtual environment (optional but recommended)
- Clone the repository:
  git clone https://github.com/nirasha-nelki/Chatbot-PDF-Reader.git
  cd Chatbot-PDF-Reader
- Create and activate a virtual environment (Windows shown; on macOS or Linux, activate with source .venv/bin/activate):
  python -m venv .venv
  .venv\Scripts\activate
- Install the required packages:
  pip install -r requirements.txt
- Set up environment variables by creating a .env file in the root directory with the following content:
  PINECONE_API_KEY=your_pinecone_api_key
  INDEX=your_pinecone_index_name
  GROQ_API_KEY=your_groq_api_key
- If you plan to use Pinecone, ensure that pincecone_flag is set to True in backend.py. Otherwise, leave it as False to use FAISS.
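The .env keys and the flag in backend.py can be sanity-checked before launching the app. The following is a minimal sketch, not code from this repository: the helper names required_keys, missing_keys, and choose_vector_store are hypothetical, and the split between keys needed for every run and keys needed only for Pinecone is an assumption that this README does not confirm.

```python
import os

# Keys named in the .env step. Which backend needs which key is an
# assumption of this sketch, not something the project documents.
COMMON_KEYS = ["GROQ_API_KEY"]
PINECONE_KEYS = ["PINECONE_API_KEY", "INDEX"]


def required_keys(use_pinecone):
    """Return the environment variable names this sketch expects."""
    return COMMON_KEYS + (PINECONE_KEYS if use_pinecone else [])


def missing_keys(env, use_pinecone):
    """List expected keys that are absent or empty in the mapping `env`."""
    return [key for key in required_keys(use_pinecone) if not env.get(key)]


def choose_vector_store(use_pinecone):
    """Mirror the flag: True selects Pinecone, False selects FAISS."""
    return "pinecone" if use_pinecone else "faiss"


if __name__ == "__main__":
    pincecone_flag = False  # same spelling as the flag in backend.py
    print("vector store:", choose_vector_store(pincecone_flag))
    print("missing keys:", missing_keys(os.environ, pincecone_flag) or "none")
```

Running a check like this first turns a missing API key into a clear message rather than a failure partway through a chat session.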
- Ingest Documents: Run the following command to ingest documents into Pinecone (if using Pinecone):
  python uploaddocs.py
- Run the Streamlit App: Start the application by running:
  streamlit run chatui.py
- Interact with the Chatbot: Open the provided URL in your web browser, and start asking questions based on the uploaded documents.
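Ingestion, as performed by uploaddocs.py, splits each document into manageable chunks before storing them in the vector store. A minimal sketch of one common approach, fixed-size character windows with overlap, is shown here in plain Python. The function name split_into_chunks and the default sizes are hypothetical; the project itself may well rely on a LangChain text splitter with different settings.

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `overlap` characters so that a sentence
    cut at a boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    print(split_into_chunks("abcdefghij", chunk_size=4, overlap=1))
    # ['abcd', 'defg', 'ghij']
```

Larger chunks give the language model more context per retrieved section, while smaller chunks make retrieval more precise; the right values depend on the documents.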
- chatui.py: The main Streamlit app file that sets up the chat UI and handles user interactions.
- backend.py: Contains the logic for interacting with the language model, managing chat history, and retrieving relevant document sections.
- uploaddocs.py: Ingests documents into Pinecone by splitting each document into manageable chunks and storing them in the vector store.
- requirements.txt: List of Python packages required to run the application.
- doc/: A folder to store the documents (PDF files) that will be used by the chatbot.
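The retrieval of relevant document sections that backend.py performs can be pictured as a nearest-neighbour search over chunk vectors. The sketch below is an illustration in plain Python, not the project's implementation: the names cosine_similarity and top_k are hypothetical, the two-dimensional vectors are toy stand-ins for real embeddings, and cosine similarity is used only as an example, since the metric used by the project's FAISS or Pinecone index is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_chunks, k=2):
    """Return the texts of the k chunks most similar to the query.

    `indexed_chunks` is a list of (text, vector) pairs.
    """
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy index: each chunk text is paired with a made-up embedding.
    chunks = [
        ("pricing", [1.0, 0.0]),
        ("installation", [0.0, 1.0]),
        ("billing", [0.9, 0.1]),
    ]
    print(top_k([1.0, 0.0], chunks, k=2))
    # ['pricing', 'billing']
```

A vector store performs the same ranking, but over embeddings produced by a model and with index structures that avoid comparing the query against every chunk.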
This project uses LangChain, Streamlit, Pinecone, and FAISS.