QAnything

Question and Answer based on Anything.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

What is QAnything?

QAnything (Question and Answer based on Anything) is a local knowledge base question-answering system that supports a wide range of file formats and databases and can be installed and used entirely offline.

With QAnything, you can drop in any locally stored file in a supported format and receive accurate, fast, and reliable answers.

Currently supported formats include PDF, Word (doc/docx), PPT, Markdown, Eml, TXT, images (jpg, png, etc.), and web links. More formats are coming soon.

Key features

  • Data security: installation and usage are supported with the network cable unplugged throughout the process.
  • Cross-language QA: freely switch between Chinese and English QA, regardless of the language of the document.
  • Massive data QA: two-stage retrieval ranking solves the degradation problem of large-scale data retrieval; the more data, the better the performance.
  • High performance: a production-grade system that is directly deployable for enterprise applications.
  • User-friendly: no cumbersome configuration; one-click installation and deployment, ready to use.
  • Multi knowledge base QA: select multiple knowledge bases for a single Q&A session.

Architecture

qanything_system

Why 2 stage retrieval?

When the knowledge base holds a large volume of data, the advantages of a two-stage approach are very clear. If only first-stage embedding retrieval is used, retrieval quality degrades as the data volume increases, as indicated by the green line in the following graph. With second-stage reranking, accuracy instead increases steadily: the more data, the better the performance.

two stage retrieval
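The two-stage idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not QAnything's implementation: `embed` is a hypothetical bag-of-words stand-in for an embedding model, `rerank_score` is a hypothetical stand-in for a cross-encoder reranker, and `two_stage_search` with its `top_k` and `top_n` parameters is an invented name. Stage one cheaply narrows the corpus to a few candidates; stage two rescores only those candidates with the more expensive pairwise scorer.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank_score(query, passage):
    """Hypothetical stand-in for a cross-encoder reranker.

    Scores the (query, passage) pair jointly: the fraction of query terms
    found in the passage, plus a bonus when the whole query appears verbatim.
    """
    q_terms = query.lower().split()
    p_text = passage.lower()
    p_terms = set(p_text.split())
    overlap = sum(t in p_terms for t in q_terms) / len(q_terms)
    return overlap + (1.0 if query.lower() in p_text else 0.0)


def two_stage_search(query, corpus, top_k=3, top_n=1):
    # Stage 1: embedding retrieval over the whole corpus (cheap, recall-oriented).
    q_vec = embed(query)
    candidates = sorted(corpus, key=lambda p: cosine(q_vec, embed(p)), reverse=True)[:top_k]
    # Stage 2: rerank only the candidates (expensive, precision-oriented).
    return sorted(candidates, key=lambda p: rerank_score(query, p), reverse=True)[:top_n]


corpus = [
    "two stage retrieval",
    "retrieval ranking uses two stage retrieval ranking for large knowledge bases",
    "the weather is sunny today",
    "stage lighting for a two act play",
]
query = "two stage retrieval ranking"
first_stage_only = two_stage_search(query, corpus, top_k=1)
best = two_stage_search(query, corpus)
```

In this toy corpus the first stage alone ranks the short passage highest, while the reranker promotes the passage that contains the full query. The real system replaces both stand-ins with trained models, so the numbers here say nothing about actual accuracy.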

QAnything uses the retrieval component BCEmbedding, which is distinguished by its bilingual and crosslingual proficiency. BCEmbedding excels at bridging the gap between Chinese and English, as the evaluation results below show.

1st Retrieval (embedding)

| Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | Avg |
|---|---|---|---|---|---|---|---|
| bge-base-en-v1.5 | 37.14 | 55.06 | 75.45 | 59.73 | 43.05 | 37.74 | 47.20 |
| bge-base-zh-v1.5 | 47.60 | 63.72 | 77.40 | 63.38 | 54.85 | 32.56 | 53.60 |
| bge-large-en-v1.5 | 37.15 | 54.09 | 75.00 | 59.24 | 42.68 | 37.32 | 46.82 |
| bge-large-zh-v1.5 | 47.54 | 64.73 | 79.14 | 64.19 | 55.88 | 33.26 | 54.21 |
| jina-embeddings-v2-base-en | 31.58 | 54.28 | 74.84 | 58.42 | 41.16 | 34.67 | 44.29 |
| m3e-base | 46.29 | 63.93 | 71.84 | 64.08 | 52.38 | 37.84 | 53.54 |
| m3e-large | 34.85 | 59.74 | 67.69 | 60.07 | 48.99 | 31.62 | 46.78 |
| bce-embedding-base_v1 | 57.60 | 65.73 | 74.96 | 69.00 | 57.29 | 38.95 | 59.43 |

2nd Retrieval (rerank)

| Model | Reranking | Avg |
|---|---|---|
| bge-reranker-base | 57.78 | 57.78 |
| bge-reranker-large | 59.69 | 59.69 |
| bce-reranker-base_v1 | 60.06 | 60.06 |

RAG Evaluations in LlamaIndex (embedding and rerank)

NOTE:

  • In the WithoutReranker setting, our bce-embedding-base_v1 outperforms all the other embedding models.
  • With the embedding model fixed, our bce-reranker-base_v1 achieves the best performance.
  • The combination of bce-embedding-base_v1 and bce-reranker-base_v1 is SOTA.
  • If you want to use the embedding and rerank models separately, please refer to BCEmbedding.

LLM

The open source version of QAnything is based on QwenLM and has been fine-tuned on a large number of professional question-answering datasets, which greatly enhances its question-answering ability. If you need to use it for commercial purposes, please follow the QwenLM license. For more details, please refer to QwenLM.

Before You Start

Star us on GitHub to be notified instantly of new releases!

Getting Started

Prerequisites

For Linux

| System | Required item | Minimum Requirement | Note |
|---|---|---|---|
| Linux | Single NVIDIA GPU Memory | >= 16GB | NVIDIA 3090 × 1 recommended |
| | or Double NVIDIA GPU Memory | >= 11GB + 5GB | NVIDIA 2080TI × 2 recommended |
| | NVIDIA Driver Version | >= 525.105.17 | |
| | CUDA Version | >= 12.0 | |
| | Docker version | >= 20.10.5 | Docker install |
| | docker compose version | >= 2.23.3 | docker compose install |
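The minimums above can be checked programmatically before running the installer. The following is a minimal sketch, not part of QAnything: `parse_version`, `meets_minimum`, and `gpu_memory_ok` are hypothetical helper names, and the reading of the dual-GPU requirement as "one card with at least 11 GB plus a second card with at least 5 GB" is an assumption. Versions are compared numerically because plain string comparison would wrongly rank 20.9.0 above 20.10.5.

```python
import re

# Minimum versions copied from the Linux requirements table.
MINIMUMS = {
    "nvidia_driver": "525.105.17",
    "cuda": "12.0",
    "docker": "20.10.5",
    "docker_compose": "2.23.3",
}


def parse_version(text):
    """Extract the first dotted version number from text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found, required):
    """Compare versions component by component, so that 20.10.5 > 20.9.0."""
    return parse_version(found) >= parse_version(required)


def gpu_memory_ok(memory_gb):
    """Check per-card GPU memory (in GB) against the table.

    Assumption: the dual-GPU row means one card with at least 11 GB plus a
    second card with at least 5 GB.
    """
    cards = sorted(memory_gb, reverse=True)
    if cards and cards[0] >= 16:
        return True
    return len(cards) >= 2 and cards[0] >= 11 and cards[1] >= 5


docker_ok = meets_minimum("Docker version 24.0.7, build afdd53b", MINIMUMS["docker"])
compose_ok = meets_minimum("Docker Compose version v2.21.0", MINIMUMS["docker_compose"])
```

Here `docker_ok` is true and `compose_ok` is false, since 2.21.0 is older than the required 2.23.3. The input strings follow the usual shape of `docker --version` and `docker compose version` output; feeding in real output from your machine is left to the caller.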

For Windows 11 with WSL 2

| System | Required item | Minimum Requirement | Note |
|---|---|---|---|
| Windows 11 with WSL 2 | Single NVIDIA GPU Memory | >= 16GB | NVIDIA 3090 |
| | or Double NVIDIA GPU Memory | >= 11GB + 5GB | NVIDIA 2080TI × 2 |
| | GEFORCE EXPERIENCE | >= 546.33 | GEFORCE EXPERIENCE download |
| | Docker Desktop | >= 4.26.1(131620) | Docker Desktop for Windows |

Installation

step1: Clone the QAnything repository.

git clone https://github.com/netease-youdao/QAnything.git

step2: Enter the project root directory and execute the startup script.

If you are on Windows 11, enter the WSL environment first.

cd QAnything
bash run.sh  # Start on GPU 0 by default.

(Optional) Specify GPU startup

cd QAnything
bash ./run.sh -c local -i 0 -b default  # GPU id 0

(Optional) Specify multi-GPU startup

cd QAnything
bash ./run.sh -c local -i 0,1 -b default  # GPU ids 0,1. Confirm how many GPUs are available; at most two cards are supported.
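If you launch QAnything from a script, the GPU selection can be validated before the startup script runs. The sketch below is a hypothetical helper, not part of the repository: `build_run_command` is an invented name, the flags (`-c local`, `-i`, `-b default`) are copied from the commands above, and the one-or-two-GPU check mirrors the note that at most two cards are supported.

```python
def build_run_command(gpu_ids=(0,)):
    """Build the run.sh argument list for the given GPU ids.

    Flags are copied from the commands in this section; the validation
    reflects the "at most two cards" note and is otherwise an assumption.
    """
    ids = list(gpu_ids)
    if not 1 <= len(ids) <= 2:
        raise ValueError("run.sh supports one or two GPUs")
    if len(set(ids)) != len(ids) or any(i < 0 for i in ids):
        raise ValueError(f"invalid GPU ids: {ids}")
    id_arg = ",".join(str(i) for i in ids)
    return ["bash", "./run.sh", "-c", "local", "-i", id_arg, "-b", "default"]


single = build_run_command()
dual = build_run_command([0, 1])
```

The returned list can be passed to `subprocess.run(dual, cwd="QAnything", check=True)`, which avoids shell quoting issues compared with building a single command string.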

step3: Start using QAnything.

Front end

After successful installation, you can use the application by opening the following address in your web browser.

  • Front end address: http://your_host:5052/qanything/
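To confirm from a script that the front end is up, you can probe the address above. This is a small sketch using only the Python standard library; `frontend_url` and `is_reachable` are hypothetical helper names, and treating any HTTP status below 400 as "up" is an assumption about what counts as healthy.

```python
import urllib.error
import urllib.request


def frontend_url(host):
    """Build the front end address from the pattern given above."""
    return f"http://{host}:5052/qanything/"


def is_reachable(url, timeout=5.0):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


url = frontend_url("localhost")
```

Calling `is_reachable(url)` shortly after startup may return False while the containers are still loading models, so a caller would typically retry for a while before giving up.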

API

If you want to access the API, please refer to the API Document section below.

Close service

If you are on Windows 11, enter the WSL environment first.

bash close.sh

FAQ

FAQ

Usage

Cross-lingual: Multiple English paper Q&A

multi_paper_qa.mp4

Information extraction

information_extraction.mp4

Various files

various_files_qa.mp4

Web Q&A

web_qa.mp4

API Document

If you need to access the API, please refer to the QAnything API documentation.

Community & Support

Discord

Welcome to the QAnything Discord community

WeChat Group

Scan the QR code below to join the WeChat group.

Email

If you need to contact our team privately, please reach out to us via the following email:

qanything@rd.netease.com

GitHub issues

Reach out to the maintainer at one of the following places:

License

QAnything is licensed under the Apache 2.0 License.

Acknowledgments

QAnything builds on the following projects:

  • Thanks to our BCEmbedding for the excellent embedding and rerank models.
  • Thanks to Qwen for strong base language models.
  • Thanks to Triton Inference Server for providing great open source inference serving.
  • Thanks to FastChat for providing a fully OpenAI-compatible API server.
  • Thanks to FasterTransformer and vLLM for highly optimized LLM inference backends.
  • Thanks to Langchain for the wonderful LLM application framework.
  • Thanks to Langchain-Chatchat for the inspiration provided on local knowledge base Q&A.
  • Thanks to Milvus for the excellent semantic search library.
  • Thanks to PaddleOCR for its easy-to-use OCR library.
  • Thanks to Sanic for the powerful web service framework.