rag-evaluation
There are 28 repositories under the rag-evaluation topic.
Giskard-AI/giskard-oss
🐢 Open-Source Evaluation & Testing library for LLM Agents
Marker-Inc-Korea/AutoRAG
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
Agenta-AI/agenta
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
vectara/open-rag-eval
Open source RAG evaluation package
LLAMATOR-Core/llamator
Framework for testing vulnerabilities of large language models (LLMs).
oztrkoguz/RAG-Framework-Evaluation
This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.
ioannis-papadimitriou/rag-playground
A framework for systematic evaluation of retrieval strategies and prompt engineering in RAG systems, featuring an interactive chat interface for document analysis.
rostyslavshovak/RAG-Retrieval-Augmented-Generation
RAG Chatbot for Financial Analysis
simranjeet97/Learn_RAG_from_Scratch_LLM
Learn Retrieval-Augmented Generation (RAG) from Scratch using LLMs from Hugging Face and LangChain or Python
shaadclt/EvalRAG
A comprehensive evaluation toolkit for assessing Retrieval-Augmented Generation (RAG) outputs using linguistic, semantic, and fairness metrics
fkapsahili/EntRAG
EntRAG - Enterprise RAG Benchmark
bluewave-labs/evalwise
EvalWise is a developer-friendly platform for LLM evaluation and red teaming that helps test AI models for safety, compliance, and performance issues
264Gaurav/medical-RAG-chatbot
A LangChain-based Retrieval-Augmented Generation (RAG) chatbot for medical data. Integrates with Gemini/Grok AI to deliver accurate, context-aware answers in healthcare and biomedical domains.
AnasAber/MLflow_with_RAG
Using MLflow to deploy your RAG pipeline, using LlamaIndex, LangChain, and Ollama/Hugging Face LLMs/Groq
Kaos599/BetterRAG
BetterRAG: Powerful RAG evaluation toolkit for LLMs. Measure, analyze, and optimize how your AI processes text chunks with precision metrics. Perfect for RAG systems, document processing, and embedding quality assessment.
sprakash21/aws-genai-rageval-bot
RAG Pipeline Evaluation and monitoring on AWS using RAGAS
Gian207/RAG-lego-like-component
Proposal for industry RAG evaluation: Generative Universal Evaluation of LLMs and Information Retrieval
keitabroadwater/llm-eval-lab
A web sandbox for hands-on learning of LLM and RAG Evaluation
TajaKuzman/pandachat-rag-benchmark
PandaChat-RAG benchmark for evaluation of RAG systems on a non-synthetic Slovenian test dataset.
igorsuhinin/rag-pdf-qa
RAG-powered PDF QA system with self-reflection and multiple retrieval strategies (Stuff/Map Reduce/Refine). Includes monitoring via Langfuse & LangSmith and containerization with Docker
jhaayush2004/RAG-Evaluation
Different approaches to evaluating RAG
marktr11/RAG-Pipeline-LLM-Evaluation
A basic RAG (Retrieval-Augmented Generation) implementation and evaluation methodology built with Python.
neomatrix369/AIE7-Cert-Challenge
AIE7: Certification Challenge
OranDanon/Gen-AI-Assignment
Home assignment featuring two AI projects: a Medical Q&A Bot for Israeli HMOs and a National Insurance Form Extractor. Built with Azure OpenAI to demonstrate practical GenAI implementation skills.
OranDanon/RAG-application
RAG chatbot over a predefined set of articles about LangChain