qdrant/fastembed

[Bug/Model Request]: Does this version support cuDNN 9.x and onnxruntime-gpu 1.18.1?

Opened this issue · 3 comments

What happened?

Problem
onnxruntime-gpu itself runs fine against CUDA 12, but when I try to use fastembed, it looks for CUDA 11 libraries, which causes an error and the embedding fails.

[Environment]

  • fastembed-gpu: 0.3.4
  • onnxruntime-gpu: 1.18.1
  • CUDA: 12.2
  • cuDNN: 9.0.1

[Content]
I followed the onnxruntime-gpu installation page (https://onnxruntime.ai/docs/install/) and installed the wheel from there. onnxruntime-gpu itself works fine, but when I try to run embeddings through fastembed, I hit the issue below:

[onnxruntime-gpu Operation Check Code]

import onnxruntime as ort

# Report the installed version, the device the wheel targets,
# and the execution providers compiled into it.
print(ort.__version__)
print(ort.get_device())
print(ort.get_available_providers())

Results:

  • 1.18.1
  • GPU
  • ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
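
Note that get_available_providers() only reports which providers the wheel was compiled with; the CUDA libraries themselves are loaded lazily when a session is created. A minimal session reproduces the load attempt without involving fastembed (a sketch, assuming the onnx package is available to build a throwaway Identity model):

import onnx
from onnx import TensorProto, helper
import onnxruntime as ort

# Build a trivial one-node Identity model purely as a probe;
# opset/IR versions are pinned low so older runtimes can load it.
inp = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1])
out = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1])
graph = helper.make_graph(
    [helper.make_node("Identity", ["x"], ["y"])], "probe", [inp], [out]
)
model = helper.make_model(graph, opset_imports=[helper.make_operatorsetid("", 13)])
model.ir_version = 8

# Requesting CUDAExecutionProvider here triggers the same
# libonnxruntime_providers_cuda.so load that fastembed triggers.
sess = ort.InferenceSession(
    model.SerializeToString(),
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # only CPUExecutionProvider if the CUDA load failed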

[Error Code]

from fastembed import SparseTextEmbedding, TextEmbedding

# Model loading is where the ONNX Runtime session is created,
# so the CUDA provider failure surfaces here.
dense_model = TextEmbedding(model_name="intfloat/multilingual-e5-large")
sparse_model = SparseTextEmbedding(model_name="Qdrant/bm42-all-minilm-l6-v2-attentions")

Error:

[E:onnxruntime:Default, provider_bridge_ort.cc:1745 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1426 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory
2024-07-31 17:43:58.453516429 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:895 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
Traceback (most recent call last):

Something in the fastembed path is looking for CUDA 11 libraries (libcublasLt.so.11). Since onnxruntime-gpu on its own recognizes the GPU without issues, I'm not sure where the mismatch comes from.
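
To double-check which cuBLAS runtime the machine can actually resolve, a small probe helps (a diagnostic sketch; the library names are the Linux ones from the error above):

import ctypes

# The CUDA execution provider dynamically links against a specific cuBLAS
# major version; this shows which of the two the dynamic linker can resolve.
for lib in ("libcublasLt.so.11", "libcublasLt.so.12"):
    try:
        ctypes.CDLL(lib)
        print(lib, "-> found")
    except OSError:
        print(lib, "-> not found")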

If there is a mistake in my setup, I apologize. Is anyone else seeing this, or can anyone spot an error in my setup steps? Any help would be greatly appreciated.

What Python version are you on? e.g. python --version

Python 3.11.6

Version

0.2.7 (Latest)

What os are you seeing the problem on?

Linux

Relevant stack traces and/or logs

No response

I have seen the same issue with Python 3.12.4 on Windows, using CUDA 12.5.

Any resolution for this? I have CUDA 12.1, Python 3.10, and onnxruntime-gpu 1.18.1. I can see that libonnxruntime_providers_cuda.so is still looking for libcublasLt.so.11. Was this by design? Shouldn't this work with CUDA 12?
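
One way to see what the provider library actually links against (a Linux-only sketch that relies on the system ldd tool and assumes the onnxruntime-gpu wheel is installed):

import pathlib
import subprocess

import onnxruntime as ort

# Locate libonnxruntime_providers_cuda.so inside the installed wheel and list
# its dynamic dependencies; a CUDA 11 build references libcublasLt.so.11
# regardless of which CUDA toolkit is installed on the machine.
pkg_dir = pathlib.Path(ort.__file__).parent
lib = next(pkg_dir.rglob("libonnxruntime_providers_cuda.so"))
print(subprocess.run(["ldd", str(lib)], capture_output=True, text=True).stdout)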

Running into this while using RunPod, but it works fine on Colab with the 1.17.1 workaround. @iKora128 any chance you were hitting this on RunPod as well?
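
In case it helps anyone reproducing this: fastembed accepts a providers argument that is forwarded to the underlying ONNX Runtime session, so CPU can be listed as an explicit fallback while the CUDA library mismatch is sorted out (a sketch using the dense model from the issue):

from fastembed import TextEmbedding

# ONNX Runtime tries the providers in order; if the CUDA provider fails to
# load, it logs the error seen above and falls back to the CPU provider.
model = TextEmbedding(
    model_name="intfloat/multilingual-e5-large",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
embeddings = list(model.embed(["hello world"]))
print(len(embeddings[0]))  # multilingual-e5-large produces 1024-dim vectors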