langchain-RAG-base

Running local Large Language Models to perform Retrieval-Augmented Generation


Local LLM with RAG

This project is an experimental sandbox for testing ideas related to running local Large Language Models (LLMs) with Ollama to perform Retrieval-Augmented Generation (RAG) for answering questions based on sample PDFs. The project also uses Ollama to create embeddings with the nomic-embed-text model, which are stored and retrieved with Chroma.
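
As a rough illustration of what the embedding step involves, a minimal LangChain sketch might look like the following. The file name, chunk sizes, persist directory, and exact import paths are assumptions (they depend on the installed LangChain version); the repository's own scripts are the reference.

    from langchain_community.document_loaders import PyPDFLoader
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    # Load a sample PDF and split it into overlapping chunks (sizes are illustrative).
    docs = PyPDFLoader("sample.pdf").load()
    chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

    # Embed the chunks with Ollama's nomic-embed-text model and persist them in Chroma.
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    db = Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")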

Requirements

  • Ollama version 0.1.26 or higher.

Setup

  1. Clone this repository to your local machine.
  2. Create a Python virtual environment by running python3 -m venv .venv.
  3. Activate the virtual environment by running source .venv/bin/activate on Unix or macOS, or .\.venv\Scripts\activate on Windows.
  4. Install the required Python packages by running pip install -r requirements.txt.

Running the Project

Creates embeddings for the provided PDF sources: python3 setup.py -p <pdf_sources>

Spins up a chat using the provided PDFs as sources: python3 app.py -p <pdf_sources>
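
Under the hood, the chat step amounts to retrieving relevant chunks from the vector store and passing them to a local model. A minimal sketch is shown below; the persisted chroma_db directory and the mistral model name are assumptions, not necessarily what app.py uses.

    from langchain_community.chat_models import ChatOllama
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain.chains import RetrievalQA

    # Reopen the persisted Chroma store and expose it as a retriever.
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    retriever = Chroma(persist_directory="chroma_db", embedding_function=embeddings).as_retriever()

    # Ground a local Ollama chat model in the retrieved chunks and ask a question.
    llm = ChatOllama(model="mistral")  # any locally pulled Ollama model
    qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
    print(qa.invoke({"query": "What is this document about?"})["result"])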

Dockerized setup

Builds the image and generates the embeddings: sudo docker build -t langchain_rag:0.0.3 --build-arg OLLAMA_HOST=http://<ollama_instance>:11434 .

Starts a Jupyter instance on port 5001; the notebook entrypoint allows interacting with the chat: sudo docker run --rm -e OLLAMA_HOST=http://<ollama_instance>:11434 --net host -it langchain_rag:0.0.3

Technologies Used

  • LangChain: A Python library for working with Large Language Models.
  • Ollama: A platform for running Large Language Models locally.
  • Chroma: A vector database for storing and retrieving embeddings.
  • PyPDF: A Python library for reading and manipulating PDF files.
  • Jupyter: A Python-based notebook system.