This repository contains an example of a simple `QnA` system based on an `LLM` and `LangChain`. As the vector search engine, I have used `Qdrant`.

> **Note**
> This project is based on `Python 3.10` and uses `poetry` to manage dependencies.
- Clone the repository using the `git clone` command.
- Open the terminal and go to the project directory using the `cd` command.
- Create a virtual environment using the `python -m venv venv` or `conda create -n venv python=3.10` command.
- Activate the virtual environment using the `source venv/bin/activate` or `conda activate venv` command.
- Install `poetry` using the instructions from here. Use the `with the official installer` section.
- Set the following option to disable new virtualenv creation: `poetry config virtualenvs.create false`
- Install dependencies using the `poetry install --no-root` command. The `--no-root` flag is needed to avoid installing the package itself.
- Set up `pre-commit` hooks using the `pre-commit install` command. More information about `pre-commit` can be found here.
- Run the tests to check that the project works correctly using the following command: `python -m unittest -b`
- After the tests pass successfully, you can work with the project!
- If you want to add new dependencies, use the `poetry add <package_name>` command. More information about `poetry` can be found here.
- If you want to add new tests, use the `unittest` library. More information about `unittest` can be found here. All tests should be placed in the `tests` directory.
- All commits are checked by `pre-commit` hooks. If you want to skip this check, use the `git commit --no-verify` command, but this is not recommended.
- You can also run the `pre-commit` hooks manually using the `pre-commit run --all-files` command.
- More useful commands can be found in the `Makefile`.
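As a sketch of the testing convention above, a file placed in the `tests` directory might look like the following. The helper function and test names are hypothetical, not taken from this repository; `python -m unittest -b` run from the project root will discover and execute such a file (the `-b` flag buffers output from passing tests).

```python
# tests/test_normalize.py — illustrative only; names are not from the project.
import unittest


def normalize_question(text: str) -> str:
    """Toy helper standing in for real project code under test."""
    return " ".join(text.split()).lower()


class TestNormalizeQuestion(unittest.TestCase):
    def test_collapses_whitespace_and_lowercases(self):
        self.assertEqual(normalize_question("  What   is RAG? "), "what is rag?")

    def test_empty_input(self):
        self.assertEqual(normalize_question(""), "")
```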
> **Warning**
> You should have `wget` to download files and `docker` to run the `Qdrant` and `Redis` services.

> **Important**
> The knowledge base for `RAG` consists of two parts: `pdf files` and data from `wikipedia`. The first part will be downloaded by a script, and the second part will be downloaded from the code.
- Run the `make download_dataset` command to download the `pdf files`. Those files will be placed in the `data` directory; they are only one part of the full dataset that will be used.
- Run the `make run_qdrant` command to start the `Qdrant` service. It will be available at the `http://localhost:6333` address.
- Run the `make run_redis` command to start the `Redis` service. It is needed to cache the `LLM` prompts and requests in `step 5`.
- Run the notebook `notebooks/01-open-source-llms-langchain.ipynb` to download the full dataset, index it, and run the `QnA` example with local `LLMs`. This notebook shows only a simple example of the `QnA` system.
- Run the notebook `notebooks/02-open-ai-langchain-base.ipynb` to download the full dataset, index it, and run the `QnA` example with `OpenAI LLMs`. This example is more complex and shows how to evaluate the `QnA` system.
- You can run the `REST API` for the `QnA` system using the `make run_api` command. `Swagger` will be available at the `http://localhost:8000/docs` address.
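Once the services are started, a quick way to confirm they are reachable is a small health check. This is a hedged sketch using only the standard library: the `/collections` path is Qdrant's standard REST endpoint and `/docs` is the Swagger address mentioned above, but adjust them if your setup differs. Redis speaks its own protocol rather than HTTP, so it is not covered by this check.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Addresses taken from the steps above; paths are assumed defaults.
SERVICES = {
    "qdrant": "http://localhost:6333/collections",
    "qna_api_swagger": "http://localhost:8000/docs",
}


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the service answers the URL with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError, ValueError):
        return False


if __name__ == "__main__":
    for name, url in SERVICES.items():
        print(f"{name}: {'up' if is_up(url) else 'down'} ({url})")
```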
- Mastering RAG: How To Architect An Enterprise RAG System - this article describes how to build a `RAG` system for the enterprise, and the `7 Failure Points` of `RAG` systems.
- Using langchain for Question Answering on Own Data - a good introduction to `LangChain` and `QnA` systems. Contains a lot of useful diagrams.
- RAG: How to Talk to Your Data - yet another good introduction to `RAG` systems, with useful examples.
- HOW-TO: Build a ChatPDF App over millions of documents with LangChain and MyScale in 30 Minutes - just a quick example of how to build a `QnA` system using `LangChain`.
- How LangChain Implements Self Querying - a look into the details of how `LangChain` implements `self-querying`.
- SPLADE: sparse neural search - `SPLADE` is a sparse neural model for efficient vector search.
- Weaviate Hybrid Search - just a quick example of hybrid search using `LangChain` and `Weaviate`.
- Qdrant Hybrid Search - just a quick example of hybrid search using `LlamaIndex` and `Qdrant`.
- Building an Ecommerce-Based Search Application Using Langchain and Qdrant’s Latest Pure Vector-Based Hybrid Search - yet another example of `LangChain` and `Qdrant` hybrid search using `SPLADE`.
- Azure OpenAI demos - Azure OpenAI demos repository.
- A complete Guide to LlamaIndex in 2024 - `LlamaIndex` blog post.
- Building LLM-based Application Using Langchain and OpenAI - a good text explanation of the `LangChain` and `OpenAI` integration.
- LOTR (Merger Retriever) in LangChain - `Merger Retriever` is a `LangChain` retriever that merges the results of multiple retrievers.
- What is the difference between LOTR(Merger Retriever) and Ensemble Retriever - `LOTR` vs `Ensemble Retriever` issue.
- TruLens - `TruLens` is an open-source package that provides instrumentation and evaluation tools for `LLM`-based applications.
- Better RAG with Merger Retriever (LOTR) and Re-ranking Retriever (Long Context Reorder)
- Long-Context Reorder - `LongContextReorder` reorders the results of a retriever so that the most relevant documents are placed at the beginning and end of the context.
- Lost in the Middle: How Language Models Use Long Contexts - article on `LLMs` and `long contexts`.
- MultiVector Retriever - `MultiVectorRetriever` is a retriever that stores multiple vectors per document to retrieve results.
- Harnessing Retrieval Augmented Generation With Langchain - `RAG` with `LangChain` article. Contains an example of a good and simple chat application.
- LangChain Interface methods - `LangChain` interface methods: how to stream the output, how to run a query asynchronously, etc.
- Save time and money by caching OpenAI (and other LLM) API calls with Langchain - `LLM` caching in `LangChain`.
- LLM Caching integrations - `LLM` caching in `LangChain`.
- LLM Caching - `LLM` caching in `LangChain`.
- Multi-Vector Retriever for RAG on tables, text, and images - `Multi-Vector Retriever` for `RAG` on tables, text, and images.
- LangChain Spark AI - `LangChain` and `Spark` integration.
- Large Language Model for Table Processing: A Survey - a survey of `LLMs` for table processing.
- Sparse Vectors in Qdrant: Pure Vector-based Hybrid Search - how to add `sparse vectors` in `Qdrant`.
- Hybrid Search: SPLADE (Sparse Encoder) - more about `SPLADE`.
- Self-reflective RAG with LangGraph: Self-RAG and CRAG - `Self-RAG` and `CRAG` with `LangGraph`.
- Webinar "A Whirlwind Tour of ML Model Serving Strategies (Including LLMs)"
- Prompt-Engineering for Open-Source LLMs
- Mitigating LLM Hallucinations with a Metrics-First Evaluation Framework
- Efficient Fine-Tuning for Llama-v2-7b on a Single GPU
- DeepLearning AI courses - good courses to start with.
- Text book - articles on AI fundamentals/concepts.
- LLM Course - a course with roadmaps and Colab notebooks.
- Building LLM-Powered Apps - `W&B` course about `LLMs`.
- Training and Fine-tuning Large Language Models (LLMs) - `W&B` course about `LLMs`.
- Large Language Models (LLMs): Foundation Models from the Ground Up - `Databricks` course about `LLMs`.
- Large Language Models (LLMs): Application through Production - `Databricks` course about `LLMs`.
- Як створити спеціалізований чатбот. Детальний гайд на основі кейса клієнта ("How to build a specialized chatbot: a detailed guide based on a client case") - an article about `chatbots`, in `Ukrainian`.