This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.

🌟 Support This Project: Your sponsorship fuels innovation in RAG technologies. Become a sponsor to help maintain and expand this valuable resource!

Advanced RAG Techniques: Elevating Your Retrieval-Augmented Generation Systems 🚀

Welcome to one of the most comprehensive and dynamic collections of Retrieval-Augmented Generation (RAG) tutorials available today. This repository serves as a hub for cutting-edge techniques aimed at enhancing the accuracy, efficiency, and contextual richness of RAG systems.

📫 Stay Updated!

Don't miss out on cutting-edge developments, new tutorials, and community insights!

Subscribe to the RAG Techniques Newsletter of DiamantAI

Introduction

Retrieval-Augmented Generation (RAG) is revolutionizing the way we combine information retrieval with generative AI. This repository showcases a curated collection of advanced techniques designed to supercharge your RAG systems, enabling them to deliver more accurate, contextually relevant, and comprehensive responses.

Our goal is to provide a valuable resource for researchers and practitioners looking to push the boundaries of what's possible with RAG. By fostering a collaborative environment, we aim to accelerate innovation in this exciting field.

Related Projects

🖋️ Check out my Prompt Engineering Techniques guide for a comprehensive collection of prompting strategies, from basic concepts to advanced techniques, enhancing your ability to interact effectively with AI language models.

🤖 Explore my GenAI Agents Repository to discover a variety of AI agent implementations and tutorials, showcasing how different AI technologies can be combined to create powerful, interactive systems.

A Community-Driven Knowledge Hub

This repository grows stronger with your contributions! Join our vibrant Discord community, the central hub for shaping and advancing this project together 🤝

RAG Techniques Discord Community

Whether you're an expert or just starting out, your insights can shape the future of RAG. Join us to propose ideas, get feedback, and collaborate on innovative techniques. For contribution guidelines, please refer to our CONTRIBUTING.md file. Let's advance RAG technology together!

🔗 For discussions on GenAI, RAG, or custom agents, or to explore knowledge-sharing opportunities, feel free to connect on LinkedIn.

Key Features

  • 🧠 State-of-the-art RAG enhancements
  • 📚 Comprehensive documentation for each technique
  • 🛠️ Practical implementation guidelines
  • 🌟 Regular updates with the latest advancements

Advanced Techniques

Explore the extensive list of cutting-edge RAG techniques:

🌱 Foundational RAG Techniques

  1. Simple RAG 🌱

    Overview 🔎

    Introducing basic RAG techniques ideal for newcomers.

    Implementation 🛠️

    Start with basic retrieval queries and integrate incremental learning mechanisms.

  2. Simple RAG using a CSV file 🧩

    Overview 🔎

    Introducing basic RAG using CSV files.

    Implementation 🛠️

    Builds basic retrieval over CSV files and integrates with OpenAI to create a question-answering system.

  3. Reliable RAG 🏷️

    Overview 🔎

    Enhances the Simple RAG by adding validation and refinement to ensure the accuracy and relevance of retrieved information.

    Implementation 🛠️

    Check for retrieved document relevancy and highlight the segment of docs used for answering.

  4. Choose Chunk Size 📏

    Overview 🔎

    Selecting an appropriate fixed size for text chunks to balance context preservation and retrieval efficiency.

    Implementation 🛠️

    Experiment with different chunk sizes to find the optimal balance between preserving context and maintaining retrieval speed for your specific use case.
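As a minimal sketch of such an experiment, the helper below splits text into fixed-size character chunks so that several sizes can be compared side by side. The name `chunk_text` and the `overlap` parameter are illustrative assumptions, not taken from the repository's notebooks, and a real comparison would also measure retrieval quality and latency for each size.

```python
def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into fixed-size character chunks with optional overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Compare how many chunks each candidate size produces for the same text.
sample = "abcdefghij" * 10  # 100 characters of stand-in text
for size in (20, 50):
    print(size, len(chunk_text(sample, chunk_size=size, overlap=5)))
```

Smaller chunks give more, finer-grained retrieval units; larger chunks keep more context per unit but produce fewer candidates to search.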

  5. Proposition Chunking ⛓️‍💥

    Overview 🔎

    Breaking down the text into concise, complete, meaningful sentences, allowing for better control and handling of specific queries (especially extracting knowledge).

    Implementation 🛠️

    • 💪 Proposition Generation: The LLM is used in conjunction with a custom prompt to generate factual statements from the document chunks.
    • ✅ Quality Checking: The generated propositions are passed through a grading system that evaluates accuracy, clarity, completeness, and conciseness.

🔍 Query Enhancement

  1. Query Transformations 🔄

    Overview 🔎

    Modifying and expanding queries to improve retrieval effectiveness.

    Implementation 🛠️

    • ✍️ Query Rewriting: Reformulate queries to improve retrieval.
    • 🔙 Step-back Prompting: Generate broader queries for better context retrieval.
    • 🧩 Sub-query Decomposition: Break complex queries into simpler sub-queries.

  2. Hypothetical Questions (HyDE Approach) ❓

    Overview 🔎

    Generating hypothetical questions to improve alignment between queries and data.

    Implementation 🛠️

    Create hypothetical questions that point to relevant locations in the data, enhancing query-data matching.

📚 Context and Content Enrichment

  1. Contextual Chunk Headers 🏷️

    Overview 🔎

    Contextual chunk headers (CCH) is a method of creating document-level and section-level context, and prepending those chunk headers to the chunks prior to embedding them.

    Implementation 🛠️

    Create a chunk header that includes context about the document and/or section of the document, and prepend that to each chunk in order to improve the retrieval accuracy.

    Additional Resources 📚

    dsRAG: open-source retrieval engine that implements this technique (and a few other advanced RAG techniques)
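A minimal sketch of the prepending step, assuming the document and section titles are already known. The function name `add_chunk_header` and the `Document:` / `Section:` header layout are hypothetical choices for illustration and are not dsRAG's API.

```python
def add_chunk_header(chunk: str, document_title: str, section_title: str = "") -> str:
    """Prepend document-level and section-level context to a chunk before embedding."""
    header_lines = [f"Document: {document_title}"]
    if section_title:
        header_lines.append(f"Section: {section_title}")
    # A blank line separates the header from the original chunk text.
    return "\n".join(header_lines) + "\n\n" + chunk


print(add_chunk_header("Revenue grew 12%.", "ACME Annual Report", "Financial Results"))
```

The string that gets embedded is the header plus the chunk, so a query that names the document or section can match a chunk that never mentions them itself.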

  2. Relevant Segment Extraction 🧩

    Overview 🔎

    Relevant segment extraction (RSE) is a method of dynamically constructing multi-chunk segments of text that are relevant to a given query.

    Implementation 🛠️

    Perform a retrieval post-processing step that analyzes the most relevant chunks and identifies longer multi-chunk segments to provide more complete context to the LLM.
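One way to sketch this post-processing step is as a search for the contiguous run of chunks with the highest total value, where each chunk's value is its relevance score minus a penalty. The function name `best_segment`, the `threshold` penalty, and the `max_length` cap are assumptions made for this illustration and may differ from the notebook's approach.

```python
def best_segment(relevance: list[float], threshold: float = 0.2,
                 max_length: int = 10) -> tuple[int, int]:
    """Return (start, end) chunk indices, end exclusive, of the highest-value segment.

    Returns (0, 0) when no segment has a positive total value.
    """
    # Subtracting the threshold makes weakly relevant chunks count against a segment.
    values = [score - threshold for score in relevance]
    best_total, best = 0.0, (0, 0)
    for start in range(len(values)):
        total = 0.0
        last_end = min(start + max_length, len(values))
        for end in range(start + 1, last_end + 1):
            total += values[end - 1]
            if total > best_total:
                best_total, best = total, (start, end)
    return best


# Chunk-level relevance scores in document order (made-up numbers).
scores = [0.1, 0.8, 0.1, 0.9, 0.05, 0.0]
print(best_segment(scores))
```

Here the low-scoring chunk between the two strong ones is included because bridging it yields a higher total than either strong chunk alone, which is the behaviour that gives the LLM more complete context.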

  3. Context Enrichment Techniques 📝

    Overview 🔎

    Enhancing retrieval accuracy by embedding individual sentences and extending context to neighboring sentences.

    Implementation 🛠️

    Retrieve the most relevant sentence while also accessing the sentences before and after it in the original text.
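A minimal sketch of the window expansion, assuming retrieval has already returned the index of the best-matching sentence. The name `with_neighbors` and the `window` parameter are illustrative, not taken from the repository.

```python
def with_neighbors(sentences: list[str], hit_index: int, window: int = 1) -> str:
    """Return the matched sentence together with `window` sentences on each side."""
    start = max(0, hit_index - window)                 # clamp at the start of the text
    end = min(len(sentences), hit_index + window + 1)  # clamp at the end of the text
    return " ".join(sentences[start:end])


sentences = ["A.", "B.", "C.", "D."]
print(with_neighbors(sentences, hit_index=2))
```

Only the single sentence is embedded and matched, while the wider window is what gets passed to the generator.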

  4. Semantic Chunking 🧠

    Overview 🔎

    Dividing documents based on semantic coherence rather than fixed sizes.

    Implementation 🛠️

    Use NLP techniques to identify topic boundaries or coherent sections within documents for more meaningful retrieval units.
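The sketch below illustrates one simple boundary rule: start a new chunk wherever the similarity between adjacent sentences drops below a threshold. Bag-of-words cosine similarity is used here only as a dependency-free stand-in for sentence embeddings, and the names `cosine`, `semantic_chunks`, and the `threshold` value are assumptions for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(sentences: list[str], threshold: float = 0.2) -> list[str]:
    """Group consecutive sentences, splitting where adjacent similarity is low."""
    if not sentences:
        return []
    vectors = [Counter(s.lower().split()) for s in sentences]
    chunks = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            chunks.append([])  # similarity dropped: treat this as a topic boundary
        chunks[-1].append(sentences[i])
    return [" ".join(chunk) for chunk in chunks]


print(semantic_chunks([
    "cats chase mice", "cats eat mice",
    "stocks fell sharply", "stocks rose later",
]))
```

With real embeddings the same loop applies; only the vectors and the similarity threshold change.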

  5. Contextual Compression 🗜️

    Overview 🔎

    Compressing retrieved information while preserving query-relevant content.

    Implementation 🛠️

    Use an LLM to compress or summarize retrieved chunks, preserving key information relevant to the query.

  6. Document Augmentation through Question Generation for Enhanced Retrieval

    Overview 🔎

    This implementation demonstrates a text augmentation technique that uses question generation to improve document retrieval within a vector database. By generating and incorporating various questions related to each text fragment, the system enhances the standard retrieval process, increasing the likelihood of finding relevant documents that can be used as context for generative question answering.

    Implementation 🛠️

    Use an LLM to augment the text dataset with questions that could be asked about each document.

🚀 Advanced Retrieval Methods

  1. Fusion Retrieval 🔗

    Overview 🔎

    Optimizing search results by combining different retrieval methods.

    Implementation 🛠️

    Combine keyword-based search with vector-based search for more comprehensive and accurate retrieval.
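A minimal sketch of one way to combine the two signals: normalize each retriever's scores to the range [0, 1] and take a weighted sum. The function names and the `alpha` weight are assumptions for illustration, and the input scores are made-up numbers standing in for real keyword (for example BM25) and vector-similarity scores.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so different retrievers become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(keyword_scores: dict[str, float], vector_scores: dict[str, float],
         alpha: float = 0.5) -> list[str]:
    """Rank documents by alpha * vector score + (1 - alpha) * keyword score."""
    kw, vec = min_max(keyword_scores), min_max(vector_scores)
    combined = {
        doc: alpha * vec.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    # Sort by descending score; ties are broken by document id for determinism.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


keyword = {"a": 3.0, "b": 1.0, "c": 0.0}
vector = {"a": 0.2, "b": 0.9, "c": 0.8}
print(fuse(keyword, vector))
```

Setting `alpha` to 0 falls back to keyword-only ranking and 1 to vector-only ranking, which makes the weight easy to tune per dataset.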

  2. Intelligent Reranking 📈

    Overview 🔎

    Applying advanced scoring mechanisms to improve the relevance ranking of retrieved results.

    Implementation 🛠️

    • 🧠 LLM-based Scoring: Use a language model to score the relevance of each retrieved chunk.
    • 🔀 Cross-Encoder Models: Re-encode both the query and retrieved documents jointly for similarity scoring.
    • 🏆 Metadata-enhanced Ranking: Incorporate metadata into the scoring process for more nuanced ranking.

  3. Multi-faceted Filtering 🔍

    Overview 🔎

    Applying various filtering techniques to refine and improve the quality of retrieved results.

    Implementation 🛠️

    • 🏷️ Metadata Filtering: Apply filters based on attributes like date, source, author, or document type.
    • 📊 Similarity Thresholds: Set thresholds for relevance scores to keep only the most pertinent results.
    • 📄 Content Filtering: Remove results that don't match specific content criteria or essential keywords.
    • 🌈 Diversity Filtering: Ensure result diversity by filtering out near-duplicate entries.
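The four filters above can be sketched as a single pass over scored results. The `Result` class, the `filter_results` function, and its parameters are hypothetical names for this illustration. Diversity filtering is simplified here to dropping results whose text is identical after lowercasing and whitespace normalization; detecting true near-duplicates would need a similarity measure.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    text: str
    score: float
    metadata: dict = field(default_factory=dict)


def filter_results(results, min_score=0.5, required_metadata=None, required_keywords=()):
    """Apply threshold, metadata, keyword, and duplicate filters in score order."""
    required_metadata = required_metadata or {}
    kept, seen = [], set()
    for result in sorted(results, key=lambda r: r.score, reverse=True):
        if result.score < min_score:
            continue  # similarity threshold
        if any(result.metadata.get(k) != v for k, v in required_metadata.items()):
            continue  # metadata filter
        lowered = result.text.lower()
        if any(keyword.lower() not in lowered for keyword in required_keywords):
            continue  # content filter
        fingerprint = " ".join(lowered.split())
        if fingerprint in seen:
            continue  # diversity filter (exact duplicates after normalization)
        seen.add(fingerprint)
        kept.append(result)
    return kept


results = [
    Result("RAG combines retrieval and generation.", 0.9, {"source": "docs"}),
    Result("rag combines  retrieval and generation.", 0.8, {"source": "docs"}),
    Result("RAG needs a retriever.", 0.3, {"source": "docs"}),
    Result("RAG is popular.", 0.7, {"source": "blog"}),
]
for r in filter_results(results, 0.5, {"source": "docs"}, ["rag"]):
    print(r.text)
```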
  4. Hierarchical Indices 🗂️

    Overview 🔎

    Creating a multi-tiered system for efficient information navigation and retrieval.

    Implementation 🛠️

    Implement a two-tiered system for document summaries and detailed chunks, both containing metadata pointing to the same location in the data.
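A minimal sketch of the two-tier lookup: rank documents by their summaries first, then search only the chunks of the top-ranked documents. Word overlap is a dependency-free stand-in for embedding similarity, and the names `overlap`, `hierarchical_search`, `top_docs`, and `top_chunks` are assumptions for illustration. The shared document id plays the role of the metadata that links a summary to its chunks.

```python
def overlap(query: str, text: str) -> int:
    """Count distinct words shared by the query and the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def hierarchical_search(query, summaries, chunks_by_doc, top_docs=1, top_chunks=2):
    """Return (doc_id, chunk) pairs found by searching summaries, then chunks."""
    # Tier 1: pick the most promising documents from their summaries.
    ranked_docs = sorted(summaries, key=lambda d: overlap(query, summaries[d]),
                         reverse=True)[:top_docs]
    # Tier 2: search only the detailed chunks of those documents.
    candidates = [(doc_id, chunk)
                  for doc_id in ranked_docs
                  for chunk in chunks_by_doc.get(doc_id, [])]
    candidates.sort(key=lambda pair: overlap(query, pair[1]), reverse=True)
    return candidates[:top_chunks]


summaries = {
    "climate": "report on climate change and emissions",
    "finance": "quarterly finance results and revenue",
}
chunks_by_doc = {
    "climate": ["emissions rose in 2020", "sea level is rising"],
    "finance": ["revenue rose in 2020", "costs fell"],
}
print(hierarchical_search("how much did emissions change", summaries, chunks_by_doc,
                          top_docs=1, top_chunks=1))
```

Chunks from documents that lose the first tier are never scored, which is where the efficiency gain comes from.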

  5. Ensemble Retrieval 🎭

    Overview 🔎

    Combining multiple retrieval models or techniques for more robust and accurate results.

    Implementation 🛠️

    Apply different embedding models or retrieval algorithms and use voting or weighting mechanisms to determine the final set of retrieved documents.
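One common rank-based voting scheme is reciprocal rank fusion (RRF), sketched below as an example of such a mechanism. It is a choice made for this illustration and is not necessarily what the notebook uses. Each retriever contributes `1 / (k + rank)` for every document it returns, and `k = 60` is the conventional default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # higher ranks contribute more
    # Sort by descending fused score; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Rankings from three hypothetical retrievers.
print(reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]))
```

Because RRF uses only ranks, it avoids having to calibrate raw scores across retrievers that use different embedding models or scoring scales.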

  6. Multi-modal Retrieval 📽️

    Overview 🔎

    Extending RAG capabilities to handle diverse data types for richer responses.

    Implementation 🛠️

🔁 Iterative and Adaptive Techniques

  1. Retrieval with Feedback Loops 🔁

    Overview 🔎

    Implementing mechanisms to learn from user interactions and improve future retrievals.

    Implementation 🛠️

    Collect and utilize user feedback on the relevance and quality of retrieved documents and generated responses to fine-tune retrieval and ranking models.

  2. Adaptive Retrieval 🎯

    Overview 🔎

    Dynamically adjusting retrieval strategies based on query types and user contexts.

    Implementation 🛠️

    Classify queries into different categories and use tailored retrieval strategies for each, considering user context and preferences.

  3. Iterative Retrieval 🔄

    Overview 🔎

    Performing multiple rounds of retrieval to refine and enhance result quality.

    Implementation 🛠️

    Use the LLM to analyze initial results and generate follow-up queries to fill in gaps or clarify information.

📊 Evaluation

  1. DeepEval Evaluation 📘

    Overview 🔎

    Evaluating Retrieval-Augmented Generation systems by covering several metrics and creating test cases.

    Implementation 🛠️

    Use the deepeval library to conduct test cases on correctness, faithfulness and contextual relevancy of RAG systems.

  2. GroUSE Evaluation 🐦

    Overview 🔎

    Evaluate the final stage of Retrieval-Augmented Generation using metrics of the GroUSE framework and meta-evaluate your custom LLM judge on GroUSE unit tests.

    Implementation 🛠️

    Use the grouse package to evaluate contextually-grounded LLM generations with GPT-4 on the 6 metrics of the GroUSE framework and use unit tests to evaluate a custom Llama 3.1 405B evaluator.

🔬 Explainability and Transparency

  1. Explainable Retrieval 🔍

    Overview 🔎

    Providing transparency in the retrieval process to enhance user trust and system refinement.

    Implementation 🛠️

    Explain why certain pieces of information were retrieved and how they relate to the query.

🏗️ Advanced Architectures

  1. Knowledge Graph Integration (Graph RAG) 🕸️

    Overview 🔎

    Incorporating structured data from knowledge graphs to enrich context and improve retrieval.

    Implementation 🛠️

    Retrieve entities and their relationships from a knowledge graph relevant to the query, combining this structured data with unstructured text for more informative responses.
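A minimal sketch of this combination, assuming the knowledge graph is available as (subject, predicate, object) triples. Case-insensitive substring matching stands in for real entity linking, and the names `graph_facts` and `build_context` as well as the context layout are hypothetical.

```python
def graph_facts(query: str, triples: list[tuple[str, str, str]]) -> list[str]:
    """Return triples whose subject or object is mentioned in the query."""
    q = query.lower()
    return [f"{s} {p} {o}" for s, p, o in triples
            if s.lower() in q or o.lower() in q]


def build_context(query: str, triples: list[tuple[str, str, str]],
                  text_chunks: list[str]) -> str:
    """Combine structured graph facts with unstructured text passages."""
    facts = graph_facts(query, triples)
    return "\n".join(["Graph facts:", *facts, "Text passages:", *text_chunks])


triples = [
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Marie Curie", "born in", "Warsaw"),
    ("Alan Turing", "born in", "London"),
]
print(build_context("Where was Marie Curie born?", triples,
                    ["Curie moved to Paris in 1891."]))
```

The resulting context string would then be placed in the prompt alongside the question, so the generator sees both the explicit relationships and the supporting text.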

  2. GraphRag (Microsoft) 🎯

    Overview 🔎

    Microsoft GraphRAG (Open Source) is an advanced RAG system that integrates knowledge graphs to improve the performance of LLMs.

    Implementation 🛠️

    • Analyze an input corpus by extracting entities and relationships from text units, then generate summaries of each community of related entities and its constituents from the bottom up.

  3. RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval 🌳

    Overview 🔎

    Implementing a recursive approach to process and organize retrieved information in a tree structure.

    Implementation 🛠️

    Use abstractive summarization to recursively process and summarize retrieved documents, organizing the information in a tree structure for hierarchical context.
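The recursion can be sketched as building a tree level by level, where each parent node is a summary of a group of child nodes. In this illustration the `summarize` argument is a placeholder for an LLM call, and grouping consecutive nodes in fixed-size batches is a simplifying assumption that stands in for grouping related nodes by content. The names `build_tree` and `group_size` are hypothetical.

```python
from typing import Callable


def build_tree(chunks: list[str], summarize: Callable[[list[str]], str],
               group_size: int = 2) -> list[list[str]]:
    """Return tree levels: level 0 holds the leaf chunks, the last level the root."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each level shrinks")
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        # Summarize each group of nodes into a single parent node.
        parents = [summarize(current[i:i + group_size])
                   for i in range(0, len(current), group_size)]
        levels.append(parents)
    return levels


# A toy summarizer that joins its inputs; a real one would call an LLM.
levels = build_tree(["a", "b", "c"], summarize=lambda group: "+".join(group))
for depth, nodes in enumerate(levels):
    print(depth, nodes)
```

At query time, nodes from every level can be searched, so broad questions can match high-level summaries while specific ones match the leaves.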

  4. Self RAG 🔁

    Overview 🔎

    A dynamic approach that combines retrieval-based and generation-based methods, adaptively deciding whether to use retrieved information and how to best utilize it in generating responses.

    Implementation 🛠️

    • Implement a multi-step process including retrieval decision, document retrieval, relevance evaluation, response generation, support assessment, and utility evaluation to produce accurate, relevant, and useful outputs.

  5. Corrective RAG 🔧

    Overview 🔎

    A sophisticated RAG approach that dynamically evaluates and corrects the retrieval process, combining vector databases, web search, and language models for highly accurate and context-aware responses.

    Implementation 🛠️

    • Integrate Retrieval Evaluator, Knowledge Refinement, Web Search Query Rewriter, and Response Generator components to create a system that adapts its information sourcing strategy based on relevance scores and combines multiple sources when necessary.
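The routing decision at the heart of this design can be sketched as a threshold check on the retrieval evaluator's scores. The function name `choose_sources`, the returned labels, and the `upper` and `lower` threshold values are assumptions for illustration, and the scores would come from the Retrieval Evaluator component in a full system.

```python
def choose_sources(relevance_scores: list[float], upper: float = 0.7,
                   lower: float = 0.3) -> str:
    """Decide where to source information based on the best relevance score."""
    best = max(relevance_scores, default=0.0)  # no documents counts as irrelevant
    if best >= upper:
        return "retrieved"   # retrieved documents are good enough on their own
    if best <= lower:
        return "web_search"  # retrieval failed; fall back to web search
    return "both"            # ambiguous; combine retrieved documents and web results


for scores in ([0.9, 0.2], [0.1, 0.2], [0.5]):
    print(scores, choose_sources(scores))
```

In the "web_search" and "both" branches, the query would first be rewritten for the search engine before the combined material is passed to the response generator.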

🌟 Special Advanced Technique 🌟

  1. Sophisticated Controllable Agent for Complex RAG Tasks 🤖

    Overview 🔎

    An advanced RAG solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve. This approach uses a sophisticated deterministic graph as the "brain" 🧠 of a highly controllable autonomous agent, capable of answering non-trivial questions from your own data.

    Implementation 🛠️

    • Implement a multi-step process involving question anonymization, high-level planning, task breakdown, adaptive information retrieval and question answering, continuous re-planning, and rigorous answer verification to ensure grounded and accurate responses.

Getting Started

To begin implementing these advanced RAG techniques in your projects:

  1. Clone this repository:
    git clone https://github.com/NirDiamant/RAG_Techniques.git
    
  2. Navigate to the technique you're interested in:
    cd all_rag_techniques/technique-name
    
  3. Follow the detailed implementation guide in each technique's directory.

Contributing

We welcome contributions from the community! If you have a new technique or improvement to suggest:

  1. Fork the repository
  2. Create your feature branch: git checkout -b feature/AmazingFeature
  3. Commit your changes: git commit -m 'Add some AmazingFeature'
  4. Push to the branch: git push origin feature/AmazingFeature
  5. Open a pull request

Contributors

License

This project is licensed under a custom non-commercial license - see the LICENSE file for details.


⭐️ If you find this repository helpful, please consider giving it a star!

Keywords: RAG, Retrieval-Augmented Generation, NLP, AI, Machine Learning, Information Retrieval, Natural Language Processing, LLM, Embeddings, Semantic Search