/wandb

leveraging Weights & Biases to monitor RAG pipeline

Primary LanguageJupyter Notebook

Installation

Ensure to install the necessary dependencies using the following commands:

!pip install langchain==0.0.350 openai faiss-gpu tiktoken
!pip install wandb

Environment Setup

Set up the required environment variables and configurations:

Setting API Key

Fetch and set the OPENAI_API_KEY environment variable for authentication. This key is essential for OpenAI's functionality and can be retrieved from Google Colab's user data.

Wandb Tracing

Enable the tracing feature for wandb in the langchain library by setting LANGCHAIN_WANDB_TRACING to "true" in the environment.

Document Processing

This code handles a PDF document ("CommonInsuranceTerms.pdf") through the following steps:

Loading the Document

Utilizes langchain.document_loaders.PyPDFLoader to load the PDF file into the environment.

Chunking Text

Divides the loaded document into smaller text chunks using langchain.text_splitter.RecursiveCharacterTextSplitter. These chunks are used for further processing.

Embeddings and Vector Stores

Convert text chunks into embeddings using OpenAI's language model (langchain.embeddings.openai.OpenAIEmbeddings) and store these embeddings in a vector store using langchain.vectorstores.faiss.FAISS.

Question-Answering Chain

Construct a retrieval-based question-answering chain (langchain.chains.RetrievalQA) that uses OpenAI's language model for answering questions. Configuration details include:

  • Language Model: Utilizes OpenAI's language model with a temperature parameter set to 0.3.
  • Chain Type: Configures the chain type as "stuff".
  • Retriever Setup: Configures the chain with a retriever generated from the earlier vector store.

Usage

After setting up the environment and processing the document, you can interact with the QA chain by asking specific questions. For instance:

print(qa_chain.run("What is Mortality Charge?"))
print(qa_chain.run("Tell me about Preferred provider organization (PPO)."))