qdrant/fastembed

Add explicit warning when a provider is requested and available, but still not set

joein opened this issue · 1 comment

What happened?

After an attempt to set CUDAExecutionProvider for TextEmbedding in Colab, the provider was not applied, and onnxruntime silently fell back to CPUExecutionProvider.

However, CUDAExecutionProvider is reported as available by onnxruntime.get_available_providers().
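The mismatch is easy to observe directly with onnxruntime (a minimal sketch; the model path is a placeholder):

```python
import onnxruntime as ort

# The provider is listed as *available*...
print(ort.get_available_providers())
# e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']

# ...but the created session may still silently fall back to CPU:
session = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
print(session.get_providers())
# ['CPUExecutionProvider'] on the broken setup, with no exception raised
```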

The root cause is a version conflict in one of the dependencies: onnxruntime tried to load libcublasLt.so.11, but only libcublasLt.so.12 was available.
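One quick way to confirm which libcublasLt version is actually loadable is a ctypes probe (a diagnostic sketch; the library names are taken from the error below):

```python
import ctypes

# Probe both library versions; on Colab's CUDA 12 image the .11
# variant fails with "cannot open shared object file"
for name in ("libcublasLt.so.11", "libcublasLt.so.12"):
    try:
        ctypes.CDLL(name)
        print(f"{name}: found")
    except OSError as exc:
        print(f"{name}: {exc}")
```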

To surface the underlying error, the script had to be launched directly with the following command:

```
!ONNX_MODE=True python3 -c 'import onnxruntime as ort; model = ort.InferenceSession("/tmp/fastembed_cache/models--qdrant--bge-small-en-v1.5-onnx-q/snapshots/8617ad28a45ac436181ff4b88181f34b272d4939/model_optimized.onnx", providers=["CUDAExecutionProvider"]); model.get_providers()'
```

The output was:

```
[E:onnxruntime:Default, provider_bridge_ort.cc:1546 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1209 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory

[W:onnxruntime:Default, onnxruntime_pybind_state.cc:861 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirementsto ensure all dependencies are met.
```

This error stems from an incompatible combination of onnxruntime-gpu and CUDA versions.

Colab ships CUDA 12, while the default onnxruntime-gpu wheels are built against CUDA 11.8. To fix it, onnxruntime-gpu should be installed from the CUDA 12 package index:

```
pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
```
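After reinstalling, requesting the GPU provider should take effect. A sketch, assuming TextEmbedding forwards the providers list to onnxruntime as described above; the model name is illustrative:

```python
from fastembed import TextEmbedding

# Assumption: TextEmbedding passes `providers` through to the
# underlying onnxruntime InferenceSession.
model = TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    providers=["CUDAExecutionProvider"],
)

embeddings = list(model.embed(["hello world"]))
print(len(embeddings[0]))  # 384-dimensional vectors for bge-small
```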

The explicit warning is to be released in fastembed 0.2.8.
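A hypothetical sketch of what the requested warning could look like (the helper name and wiring are assumptions, not fastembed's actual implementation):

```python
import warnings

import onnxruntime as ort


def warn_on_provider_fallback(session: ort.InferenceSession, requested: list) -> None:
    """Hypothetical helper: warn when a requested, available provider was not applied."""
    active = session.get_providers()
    available = ort.get_available_providers()
    for provider in requested:
        if provider in available and provider not in active:
            warnings.warn(
                f"{provider} was requested and is reported as available, "
                f"but the session fell back to {active}. "
                "Check that the CUDA/cuDNN libraries match the onnxruntime-gpu build."
            )
```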