Please support reranker API
thiner opened this issue · 12 comments
Is your feature request related to a problem? Please describe.
Nowadays, embedding + reranker is the SOTA solution for improving the accuracy of RAG systems. We already have embedding API support in LocalAI; it would be a big step forward if we could support a reranker API as well.
Describe the solution you'd like
There are many reranker models out there, some famous names: bce-reranker-base_v1, CohereRerank, bge-reranker-v2-m3. I think the Jina reranker API would be a good format to implement. https://jina.ai/reranker/#apiform
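For context, a Jina-style rerank exchange pairs a query with candidate documents and returns per-document relevance scores. The sketch below illustrates the general shape; the exact field names and model name are assumptions based on the linked API page, not a verified contract:

```python
# Illustrative request body for a Jina-style /v1/rerank endpoint.
# Field names ("model", "query", "documents", "top_n") are assumptions
# drawn from the linked API page.
rerank_request = {
    "model": "jina-reranker-v1-base-en",
    "query": "what is a reranker?",
    "documents": [
        "A reranker re-scores retrieved passages against the query.",
        "Bananas are rich in potassium.",
    ],
    "top_n": 2,
}

# Illustrative response: each result pairs a document index with a
# relevance score, sorted by descending relevance.
rerank_response = {
    "results": [
        {"index": 0, "relevance_score": 0.92},
        {"index": 1, "relevance_score": 0.05},
    ]
}
```

This request/response shape is also close to what other hosted rerankers (e.g. Cohere's rerank endpoint) expose, which is part of why it is a reasonable format to standardize on.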
Describe alternatives you've considered
n/a
Additional context
The benchmark regarding embedding+reranker for RAG: [image attachment not reproduced here]
I spent some time trying to figure out how to implement this. Below are my findings:
- Using a reranker model is as easy as using an embedding model. Below is an example with bce-reranker-base_v1.
from sentence_transformers import CrossEncoder
# init reranker model
model = CrossEncoder('maidalun1020/bce-reranker-base_v1', max_length=512)
# (query, passage) pairs to score
sentence_pairs = [('what is a reranker?', 'A reranker re-scores retrieved passages.'),
                  ('what is a reranker?', 'Bananas are rich in potassium.')]
# calculate relevance scores of the sentence pairs
scores = model.predict(sentence_pairs)
backend/python/sentencetransformers/sentencetransformers.py already supports embedding; adding a Rerank method with the code above may be enough. The major work would be refactoring the protobuf definitions and the API component.
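The backend method suggested above could be sketched roughly as follows. The function name, signature, and injectable scorer are illustrative assumptions, not LocalAI's actual protobuf API; in the real backend, score_fn would be CrossEncoder(...).predict:

```python
# Hypothetical sketch of the reranking logic a Rerank method would wrap.
# score_fn is injectable so the ranking logic is testable without
# downloading a model; in practice it would be CrossEncoder(...).predict.
def rerank(query, documents, score_fn, top_n=None):
    """Score (query, document) pairs and return (index, document, score)
    tuples sorted by descending relevance."""
    pairs = [(query, doc) for doc in documents]
    scores = score_fn(pairs)
    ranked = sorted(
        ((i, doc, s) for i, (doc, s) in enumerate(zip(documents, scores))),
        key=lambda t: t[2],
        reverse=True,
    )
    return ranked[: top_n or len(ranked)]
```

Returning the original indices alongside the scores matters because rerank API responses (Jina, Cohere) identify documents by their position in the request.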
that's definitely a good addition - adding it to our roadmap. Thanks also for pointing out the steps
Thank you for your feature request, Thiner! We appreciate the details and the code example you provided to help illustrate the implementation. Adding reranker API support to LocalAI would indeed be a valuable improvement, especially considering the current state-of-the-art solutions for RAG systems.
To proceed with this feature request, we will evaluate the feasibility of incorporating a reranker API implementation similar to the example you provided, using a model like bce-reranker-base_v1
or other popular options. We will also research the best practices for integrating this functionality into LocalAI's existing architecture.
We will update the roadmap to include this feature request. Once we have completed our internal discussions and evaluations, we will provide an estimate of when this feature can be implemented, along with any additional details regarding the implementation.
Feel free to reach out if you have any further questions or concerns in the meantime. Thanks again for your suggestion and for helping us improve LocalAI!
as a possible more configurable approach we may benefit from the project https://github.com/AnswerDotAI/rerankers
having a quick look at this - let's see if we can get something working before the weekend
that is very good news. thank you very much
@mudler How can I attach the reranker feature to the cublas-cuda12-core image? I tried to do so in a Dockerfile with RUN make BUILD_TYPE=cublas -C backend/python/rerankers, but it failed with the error below:
0.189 make: Entering directory '/build/backend/python/rerankers'
0.189 python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
0.189 make: Leaving directory '/build/backend/python/rerankers'
0.189 /opt/conda/bin/python3: Error while finding module specification for 'grpc_tools.protoc' (ModuleNotFoundError: No module named 'grpc_tools')
0.189 make: *** [Makefile:27: backend_pb2.py] Error 1
What should I do to fix?
Please forgive my lazy thinking, the solution is quite straightforward: pip install grpcio-tools.
@mudler thank you for the cross-link references since I was mostly focused on LiteLLM and Ollama for maximizing compatibility, but knowing that LocalAI is "getting there" is quite a relief
I'd suggest using the standard (non-core) images, as the core ones do not come with the additional Python dependencies. If you still want to use the core images, you can either create a Dockerfile based on top of one and run the command to prepare the backend, or use EXTRA_BACKENDS, as outlined in the docs here:
https://localai.io/advanced/#extra-backends
So, for instance you can use it like this:
docker run --env EXTRA_BACKENDS="backend/python/rerankers" quay.io/go-skynet/local-ai:master-ffmpeg-core
and that should install the needed Python dependencies on startup.
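For the Dockerfile-on-top-of-a-core-image route mentioned above, a minimal sketch could look like the following. The image tag is an example, and the build steps are an assumption based on the make invocation and the grpcio-tools fix discussed earlier in this thread:

```dockerfile
# Hypothetical Dockerfile extending a core image to prepare the
# rerankers backend at build time; the base image tag is an example.
FROM quay.io/go-skynet/local-ai:master-cublas-cuda12-core
RUN pip install grpcio-tools && \
    make BUILD_TYPE=cublas -C backend/python/rerankers
```

This bakes the backend into the image instead of preparing it on every container start, at the cost of a larger image.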
Yes, I did so. The Docker image is created successfully by specifying the extra backend, but I still get the gRPC error at runtime. Were there any changes to the gRPC module in the v2.13.0 release? I built the autogptq image with this Dockerfile previously, and that worked.