Please support reranker API
thiner opened this issue · 12 comments
Is your feature request related to a problem? Please describe.
Nowadays, embedding + reranker is the SOTA solution for improving the accuracy of RAG systems. We already have embedding API support in LocalAI; it would be a big step forward if we could support a reranker API as well.
Describe the solution you'd like
There are many reranker models out there, some famous names: bce-reranker-base_v1, CohereRerank, bge-reranker-v2-m3. I think the Jina reranker API would be a good format to implement. https://jina.ai/reranker/#apiform
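For context, a Jina-style rerank exchange pairs a query with candidate documents and returns per-document relevance scores. The sketch below illustrates the general shape; the exact field names and model name are assumptions based on the linked API page, not a verified contract:

```python
# Illustrative request body for a Jina-style /v1/rerank endpoint.
# Field names ("model", "query", "documents", "top_n") are assumptions
# drawn from the linked API page.
rerank_request = {
    "model": "jina-reranker-v1-base-en",
    "query": "what is a reranker?",
    "documents": [
        "A reranker re-scores retrieved passages against the query.",
        "Bananas are rich in potassium.",
    ],
    "top_n": 2,
}

# Illustrative response: each result pairs a document index with a
# relevance score, sorted by descending relevance.
rerank_response = {
    "results": [
        {"index": 0, "relevance_score": 0.92},
        {"index": 1, "relevance_score": 0.05},
    ]
}
```

This request/response shape is also close to what other hosted rerankers (e.g. Cohere's rerank endpoint) expose, which is part of why it is a reasonable format to standardize on.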
Describe alternatives you've considered
n/a
Additional context
The benchmark regarding embedding+reranker for RAG: [image attachment not reproduced here]
I spent some time trying to figure out how to implement this. Below are my findings:
- Using a reranker model is as easy as using an embedding model. Below is an example with bce-reranker-base_v1.
from sentence_transformers import CrossEncoder
# init reranker model
model = CrossEncoder('maidalun1020/bce-reranker-base_v1', max_length=512)
# (query, passage) pairs to score
sentence_pairs = [('what is a reranker?', 'A reranker re-scores retrieved passages.'),
                  ('what is a reranker?', 'Bananas are rich in potassium.')]
# calculate relevance scores of the sentence pairs
scores = model.predict(sentence_pairs)
backend/python/sentencetransformers/sentencetransformers.py already supports embedding; adding a Rerank method with the code above may be enough. The major work would be refactoring the protobuf definitions and the API component.
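The backend method suggested above could be sketched roughly as follows. The function name, signature, and injectable scorer are illustrative assumptions, not LocalAI's actual protobuf API; in the real backend, score_fn would be CrossEncoder(...).predict:

```python
# Hypothetical sketch of the reranking logic a Rerank method would wrap.
# score_fn is injectable so the ranking logic is testable without
# downloading a model; in practice it would be CrossEncoder(...).predict.
def rerank(query, documents, score_fn, top_n=None):
    """Score (query, document) pairs and return (index, document, score)
    tuples sorted by descending relevance."""
    pairs = [(query, doc) for doc in documents]
    scores = score_fn(pairs)
    ranked = sorted(
        ((i, doc, s) for i, (doc, s) in enumerate(zip(documents, scores))),
        key=lambda t: t[2],
        reverse=True,
    )
    return ranked[: top_n or len(ranked)]
```

Returning the original indices alongside the scores matters because rerank API responses (Jina, Cohere) identify documents by their position in the request.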
that's definitely a good addition - adding it to our roadmap. Thanks also for pointing out the steps
Thank you for your feature request, Thiner! We appreciate the details and the code example you provided to help illustrate the implementation. Adding reranker API support to LocalAI would indeed be a valuable improvement, especially considering the current state-of-the-art solutions for RAG systems.
To proceed with this feature request, we will evaluate the feasibility of incorporating a reranker API implementation similar to the example you provided, using a model like bce-reranker-base_v1
or other popular options. We will also research the best practices for integrating this functionality into LocalAI's existing architecture.
We will update the roadmap to include this feature request. Once we have completed our internal discussions and evaluations, we will provide an estimate of when this feature can be implemented, along with any additional details regarding the implementation.
Feel free to reach out if you have any further questions or concerns in the meantime. Thanks again for your suggestion and for helping us improve LocalAI!
as a possible more configurable approach we may benefit from the project https://github.com/AnswerDotAI/rerankers
having a quick look at this - let's see if we can get something working before the weekend
that is very good news. thank you very much
@mudler How can I attach the reranker feature to the cublas-cuda12-core image? I tried to do so in a Dockerfile with RUN make BUILD_TYPE=cublas -C backend/python/rerankers, but it failed with the error below:
0.189 make: Entering directory '/build/backend/python/rerankers'
0.189 python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
0.189 make: Leaving directory '/build/backend/python/rerankers'
0.189 /opt/conda/bin/python3: Error while finding module specification for 'grpc_tools.protoc' (ModuleNotFoundError: No module named 'grpc_tools')
0.189 make: *** [Makefile:27: backend_pb2.py] Error 1
What should I do to fix?
Please forgive my lazy thinking, the solution is quite straightforward: pip install grpcio-tools.
@mudler thank you for the cross-link references since I was mostly focused on LiteLLM and Ollama for maximizing compatibility, but knowing that LocalAI is "getting there" is quite a relief
I'd suggest using the standard (non-core) images, as the core ones do not come with the additional Python dependencies. If you still want to use the core images, you can either create a Dockerfile based on top of one and run the command to prepare the backend, or use EXTRA_BACKENDS, as outlined in the docs here:
https://localai.io/advanced/#extra-backends
So, for instance you can use it like this:
docker run --env EXTRA_BACKENDS="backend/python/rerankers" quay.io/go-skynet/local-ai:master-ffmpeg-core
and that should install the needed Python dependencies on startup.
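For the Dockerfile-on-top-of-a-core-image route mentioned above, a minimal sketch could look like the following. The image tag is an example, and the build steps are an assumption based on the make invocation and the grpcio-tools fix discussed earlier in this thread:

```dockerfile
# Hypothetical Dockerfile extending a core image to prepare the
# rerankers backend at build time; the base image tag is an example.
FROM quay.io/go-skynet/local-ai:master-cublas-cuda12-core
RUN pip install grpcio-tools && \
    make BUILD_TYPE=cublas -C backend/python/rerankers
```

This bakes the backend into the image instead of preparing it on every container start, at the cost of a larger image.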
Yes, I did so. The Docker image is created successfully by specifying the extra backend, but I still get the gRPC error at runtime. Were there any changes to the gRPC module in the v2.13.0 release? I built the autogptq image with this Dockerfile previously, and that worked.