DocQA-Bench is a library for benchmarking document question-answering systems. It provides a flexible framework for evaluating various components of a QA pipeline, including document chunking, embedding generation, vector storage, question generation, and answer generation.
- Modular design allowing easy swapping of components
- Support for custom document preprocessing
- Integration with OpenAI's GPT models for question and answer generation
- Customizable chunking strategies
- Vector store integration for efficient similarity search
- F1 score evaluation for answer quality
- Asynchronous processing for improved performance
You can install DocQA-Bench using pip:
    pip install docqa-bench
Check out the example.py file in the root directory for a demonstration of how to use docqa_bench. This example shows how to:
- Set up the necessary components
- Create a benchmark
- Run the benchmark on preprocessed content
- Handle and display the results
To run the example:
    python example.py
Here's a basic example of how to use DocQA-Bench:
    import asyncio

    from docqa_bench import (
        Benchmark, PreprocessedDocument, SimpleChunker, OpenAIEmbedder,
        ChromaStore, OpenAIQuestionGenerator, OpenAIAnswerGenerator, F1Evaluator
    )


    async def run_benchmark(content: str):
        # Wrap text that has already been cleaned and is ready for chunking.
        document = PreprocessedDocument(content)

        # Configure each pipeline component.
        chunker = SimpleChunker(chunk_size=1000, chunk_overlap=200)
        embedder = OpenAIEmbedder("text-embedding-ada-002")
        vector_store = ChromaStore("benchmark_collection")
        question_generator = OpenAIQuestionGenerator("gpt-4o-mini")
        answer_generator = OpenAIAnswerGenerator("gpt-4o-mini")
        evaluator = F1Evaluator()

        benchmark = Benchmark(
            document, chunker, embedder, vector_store,
            question_generator, answer_generator, evaluator
        )

        # run() is a coroutine, so it must be awaited.
        results = await benchmark.run()
        print(results)


    asyncio.run(run_benchmark("Your document content here"))
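The `chunk_size` and `chunk_overlap` arguments in the example above suggest a sliding-window split. The sketch below illustrates that general technique in plain Python. It is not taken from `SimpleChunker`; the function name `sliding_window_chunks` is hypothetical, and it assumes sizes are measured in characters, which the library may or may not do.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters.

    Illustrative only: sizes are assumed to be character counts.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Under these assumptions, `chunk_size=1000` with `chunk_overlap=200` would start a new chunk every 800 characters, so text near a chunk boundary appears in two neighbouring chunks.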
DocQA-Bench consists of several modular components:
- Document: Represents the input document. Use PreprocessedDocument for already processed text.
- Chunker: Splits the document into manageable pieces. SimpleChunker is provided out of the box.
- Embedder: Generates vector representations of text. OpenAIEmbedder uses OpenAI's embedding model.
- VectorStore: Stores and retrieves vector representations. ChromaStore is provided as a default implementation.
- QuestionGenerator: Generates questions based on the document content. OpenAIQuestionGenerator uses OpenAI's GPT model.
- AnswerGenerator: Generates answers to questions. OpenAIAnswerGenerator uses OpenAI's GPT model.
- Evaluator: Evaluates the quality of generated answers. F1Evaluator is provided as a default implementation.
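To make the Evaluator entry more concrete, here is a minimal sketch of token-level F1 as it is commonly computed for QA answers. It is an assumption about what F1 scoring usually means in this setting, not a copy of `F1Evaluator`; the function name `token_f1` and the lowercase, whitespace-based tokenization are illustrative choices.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer is a miss.
        return float(pred_tokens == ref_tokens)

    # Multiset intersection: a shared token counts only as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For instance, a prediction of "the cat" against a reference of "the cat sat" has precision 1 and recall 2/3, giving an F1 of 0.8.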
You can create custom implementations of any component by subclassing the respective base classes in the docqa_bench.core module.
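As a rough illustration of a custom component, the sketch below groups whole sentences into chunks. It is written as a standalone class because the base class names and required method signatures in `docqa_bench.core` are not shown here; the class name `SentenceChunker`, the `chunk` method, and the `max_chars` parameter are all hypothetical. A real implementation would subclass the library's chunker base class and follow its interface.

```python
import re


class SentenceChunker:
    """Hypothetical chunker that packs whole sentences into each chunk."""

    def __init__(self, max_chars: int = 500):
        self.max_chars = max_chars

    def chunk(self, text: str) -> list[str]:
        # Split after sentence-ending punctuation followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

        chunks, current = [], ""
        for sentence in sentences:
            candidate = f"{current} {sentence}".strip()
            if current and len(candidate) > self.max_chars:
                # Adding this sentence would exceed the limit, so close the chunk.
                chunks.append(current)
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(current)
        return chunks
```

A single sentence longer than `max_chars` is kept intact as its own chunk rather than being cut mid-sentence, which is a deliberate simplification in this sketch.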
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License.
If you encounter any problems or have any questions, please open an issue on the GitHub repository.