AmenRa/retriv

[Feature Request] Allow GPU for query embedding

miikatoi opened this issue · 1 comment

Hi,

Really great and useful library. Thanks for making it available for everyone.

I am mostly using this for quick evaluation of search models, and I realized that DenseRetriever only uses the GPU for encoding documents when building the index, not for encoding queries when running search, which makes it a bit slow for larger sets of queries.

Would you consider adding a use_gpu keyword argument to the search, msearch, and bsearch methods of DenseRetriever and HybridRetriever? It looks like it could be handled similarly to how the index method does it.
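For illustration, here is a minimal caller-side sketch of what that could look like; the helper function is hypothetical (not part of retriv) and relies only on the change_device method used in the workaround further down:

def search_with_gpu(dr, query, use_gpu=False, **kwargs):
    # Hypothetical helper: encode the query on the GPU for a single
    # search call, then restore the encoder to the CPU.
    if use_gpu:
        dr.encoder.change_device('cuda')
    try:
        return dr.search(query=query, **kwargs)
    finally:
        if use_gpu:
            dr.encoder.change_device('cpu')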

Just in case someone else runs into the same issue, the problem can be worked around by setting the encoder device directly before running search, as follows:

use_gpu = True

# dr is an existing DenseRetriever; collection and queries are prepared as usual.
dr = dr.index(collection, use_gpu=use_gpu)

# Move only the query encoder to the GPU before searching.
if use_gpu:
    dr.encoder.change_device('cuda')

r = dr.bsearch(queries=queries)

# Switch the encoder back to the CPU afterwards.
dr.encoder.change_device('cpu')

Thanks!

AmenRa commented

Hi, thanks for the kind words!

I did not expose an option for using the GPU at query time because autofaiss did not detect my GPU correctly. I also think it would be inconsistent to have query encoding run on the GPU while the neighbor search runs on the CPU.

By changing the device of the encoder, you are only changing where the queries are encoded but not where the nearest neighbor search happens.
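For context, moving the nearest-neighbor search itself to the GPU would mean converting the underlying Faiss index. Below is a minimal, self-contained sketch with stand-in data; this is plain Faiss usage (it requires the faiss-gpu package), not something retriv currently exposes:

import faiss
import numpy as np

d = 384                                        # embedding dimensionality (illustrative)
xb = np.random.rand(1000, d).astype('float32') # stand-in document embeddings
xq = np.random.rand(10, d).astype('float32')   # stand-in query embeddings

cpu_index = faiss.IndexFlatIP(d)               # inner-product index built on the CPU
cpu_index.add(xb)

res = faiss.StandardGpuResources()             # GPU scratch resources
gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)  # copy the index to GPU 0

scores, ids = gpu_index.search(xq, 10)         # neighbor search now runs on the GPU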

I want to do an overhaul of the dense retriever and look closely into the issue I had with autofaiss.
Unfortunately, I have been quite busy (or on vacation) lately. I hope to have time to work on it soon.

Best,

Elias