CUDA is available but importing bitsandbytes fails
ZeroneBo opened this issue · 2 comments
ZeroneBo commented
System Info
CentOS Linux release 7.8.2003 (Core)
NVIDIA A100-PCIE-40GB 1gpu
NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.6.68
nvidia-nvtx-cu12 12.1.105
torch 2.4.1
triton 3.0.0
trl 0.9.6
transformers 4.44.2
bitsandbytes==0.44.0.dev0, also tried 0.39.0, etc.
Could not find the bitsandbytes CUDA binary at PosixPath('/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/libbitsandbytes_cuda121.so')
Could not load bitsandbytes native library: /public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/cextension.py", line 104, in <module>
lib = get_native_library()
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/cextension.py", line 91, in get_native_library
dll = ct.cdll.LoadLibrary(str(binary_path))
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
return self._dlltype(name)
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory
CUDA Setup failed despite CUDA being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
The directory listed in your path is found to be non-existent: /opt/gridview/slurm/lib64
The directory listed in your path is found to be non-existent: /usr/local/cuda/lib
The directory listed in your path is found to be non-existent: /opt/gridview/slurm/lib64
The directory listed in your path is found to be non-existent: /usr/local/cuda/lib
The directory listed in your path is found to be non-existent: /opt/gridview/slurm/lib64
The directory listed in your path is found to be non-existent: /opt/gridview/clusquota/man
The directory listed in your path is found to be non-existent: /opt/gridview/clusquota/man
The directory listed in your path is found to be non-existent: /opt/gridview/clusquota/man
The directory listed in your path is found to be non-existent: /public/home/sb/perl5/lib/perl5
The directory listed in your path is found to be non-existent: --install_base /public/home/sb/perl5
The directory listed in your path is found to be non-existent: /public/home/sb/.vscode-server/cli/servers/Stable-89de5a8d4d6205e5b11647eb6a74844ca23d2573/server/extensions/git/dist/askpass-main.js
The directory listed in your path is found to be non-existent: /opt/clusconf
The directory listed in your path is found to be non-existent: /run/user/2014/vscode-git-80b887459b.sock
The directory listed in your path is found to be non-existent: /run/user/2014/vscode-ipc-f1d5091b-075d-4c0b-8252-1bbfe64ebf5a.sock
The directory listed in your path is found to be non-existent: /public/home/sb/.vscode-server/cli/servers/Stable-89de5a8d4d6205e5b11647eb6a74844ca23d2573/server/bin/helpers/browser.sh
The directory listed in your path is found to be non-existent: /public/home/sb/.vscode-server/cli/servers/Stable-89de5a8d4d6205e5b11647eb6a74844ca23d2573/server/node
The directory listed in your path is found to be non-existent: /public/home/sb/.vscode-server/cli/servers/Stable-89de5a8d4d6205e5b11647eb6a74844ca23d2573/server/extensions/git/dist/askpass.sh
The directory listed in your path is found to be non-existent: INSTALL_BASE=/public/home/sb/perl5
Traceback (most recent call last):
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/diagnostics/main.py", line 66, in main
sanity_check()
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/diagnostics/main.py", line 40, in sanity_check
adam.step()
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/torch/optim/optimizer.py", line 484, in wrapper
out = func(*args, **kwargs)
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/optim/optimizer.py", line 287, in step
self.update_step(group, p, gindex, pindex)
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/optim/optimizer.py", line 500, in update_step
F.optimizer_update_32bit(
File "/public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/functional.py", line 1604, in optimizer_update_32bit
optim_func = str2optimizer32bit[optimizer_name][0]
NameError: name 'str2optimizer32bit' is not defined
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
CUDA specs: CUDASpecs(highest_compute_capability=(8, 0), cuda_version_string='121', cuda_version_tuple=(12, 1))
PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: (8, 0).
Library not found: /public/home/sb/anaconda3/envs/ft/lib/python3.10/site-packages/bitsandbytes-0.44.0.dev0-py3.10-linux-x86_64.egg/bitsandbytes/libbitsandbytes_cuda121.so. Maybe you need to compile it from source?
If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION`,
for example, `make CUDA_VERSION=113`.
The CUDA version for the compile might depend on your conda install, if using conda.
Inspect CUDA version via `conda list | grep cuda`.
To manually override the PyTorch CUDA version please see: https://github.com/TimDettmers/bitsandbytes/blob/main/docs/source/nonpytorchcuda.mdx
Found duplicate CUDA runtime files (see below).
We select the PyTorch default CUDA runtime, which is 12.1,
but this might mismatch with the CUDA version that is needed for bitsandbytes.
To override this behavior set the `BNB_CUDA_VERSION=<version string, e.g. 122>` environmental variable.
For example, if you want to use the CUDA version 122,
BNB_CUDA_VERSION=122 python ...
OR set the environmental variable in your .bashrc:
export BNB_CUDA_VERSION=122
In the case of a manual override, make sure you set LD_LIBRARY_PATH, e.g.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2,
* Found CUDA runtime at: /usr/local/cuda/lib64/libcudart.so
* Found CUDA runtime at: /usr/local/cuda/lib64/libcudart.so.12
* Found CUDA runtime at: /usr/local/cuda/lib64/libcudart.so.12.2.140
* Found CUDA runtime at: /usr/local/cuda/lib64/libcudart.so
* Found CUDA runtime at: /usr/local/cuda/lib64/libcudart.so.12
* Found CUDA runtime at: /usr/local/cuda/lib64/libcudart.so.12.2.140
* Found CUDA runtime at: /public/home/sb/anaconda3/envs/ft/lib/libcudart.so
* Found CUDA runtime at: /public/home/sb/anaconda3/envs/ft/lib/libcudart.so.12.6.68
* Found CUDA runtime at: /public/home/sb/anaconda3/envs/ft/lib/libcudart.so.12
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and CUDA is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.
For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND=cuda -S .`.
See the documentation for more details if needed.
Trying a simple check anyway, but this will likely fail...
Above we output some debug information.
Please provide this info when creating an issue via https://github.com/TimDettmers/bitsandbytes/issues/new/choose
WARNING: Please be sure to sanitize sensitive info from the output before posting it.
Reproduction
python -m bitsandbytes
Expected behavior
CUDA works with torch and transformers but not with bitsandbytes. I want to be able to import and use bitsandbytes without errors.
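The output above fails because `libbitsandbytes_cuda121.so` (and the `libbitsandbytes_cpu.so` fallback) cannot be found inside the installed package. A minimal sketch for checking which native binaries are actually present, assuming the package can be located with `importlib.util.find_spec`. The helper names `package_dir` and `native_libs` are made up for illustration and are not part of bitsandbytes:

```python
from __future__ import annotations

from importlib.util import find_spec
from pathlib import Path


def package_dir(name: str = "bitsandbytes") -> Path | None:
    """Locate an installed top-level package directory without importing it."""
    spec = find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def native_libs(directory: Path) -> list[str]:
    """Return the sorted names of libbitsandbytes* files in `directory`."""
    return sorted(p.name for p in directory.glob("libbitsandbytes*") if p.is_file())


if __name__ == "__main__":
    expected = "libbitsandbytes_cuda121.so"  # name taken from the error message
    pkg = package_dir()
    if pkg is None:
        print("bitsandbytes is not installed in this environment")
    else:
        found = native_libs(pkg)
        print(f"package directory: {pkg}")
        print(f"binaries present: {found or 'none'}")
        print(f"{expected} present: {expected in found}")
```

If no binaries are listed, the install probably contains only the Python sources, which would match the "Maybe you need to compile it from source?" hint in the log.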
tanvisharma commented
I faced a similar issue.
You could try building and installing from source. Then add the corresponding /lib directory to LD_LIBRARY_PATH and the /bin directory to PATH.
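Before changing LD_LIBRARY_PATH, it may help to see which of its entries exist and which ones contain a CUDA runtime. A rough sketch, assuming a colon-separated path as on Linux. The function name `scan_library_path` is hypothetical, and the check only looks for files named `libcudart.so*`:

```python
from __future__ import annotations

import os
from pathlib import Path


def scan_library_path(value: str) -> tuple[list[str], dict[str, list[str]]]:
    """Split a search path into missing directories and libcudart hits per directory."""
    missing: list[str] = []
    runtimes: dict[str, list[str]] = {}
    # dict.fromkeys drops repeated entries while keeping their original order.
    for entry in dict.fromkeys(e for e in value.split(os.pathsep) if e):
        directory = Path(entry)
        if not directory.is_dir():
            missing.append(entry)
            continue
        hits = sorted(p.name for p in directory.glob("libcudart.so*"))
        if hits:
            runtimes[entry] = hits
    return missing, runtimes


if __name__ == "__main__":
    missing, runtimes = scan_library_path(os.environ.get("LD_LIBRARY_PATH", ""))
    for entry in missing:
        print(f"missing directory: {entry}")
    for entry, names in runtimes.items():
        print(f"CUDA runtime in {entry}: {', '.join(names)}")
```

More than one directory with a runtime, as in the "Found duplicate CUDA runtime files" part of the log, is not necessarily an error, but it shows which copies are candidates for loading.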
ZeroneBo commented
Hi @tanvisharma, thank you for your reply. I tried installing from source by following https://huggingface.co/docs/bitsandbytes/main/en/installation?backend=Apple+Silicon+%28MPS%29&source=Linux#installation
with these commands:
git clone https://github.com/TimDettmers/bitsandbytes.git && cd bitsandbytes/
pip install -r requirements-dev.txt
and
bash install_cuda.sh 121 ~/cuda-121
export BNB_CUDA_VERSION=121
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/YOUR_USERNAME/local/cuda-11.7
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/cuda-121/cuda-12.1/lib64
export PATH=~/cuda-121/cuda-12.1/bin
Then I ran pip install -e . but it still seems to report the same error.
Could you provide a more detailed solution?
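For reference, the log suggests that the binary filename is derived from a CUDA version string (`cuda_version_string='121'` gives `libbitsandbytes_cuda121.so`) and that `BNB_CUDA_VERSION` replaces that string. A small sketch of this naming pattern, inferred from the log only. `expected_binary_name` is a hypothetical helper, not a bitsandbytes function, and the CPU fallback branch is an assumption based on the `libbitsandbytes_cpu.so` name in the traceback:

```python
from __future__ import annotations

import os


def expected_binary_name(torch_cuda_version: str | None, override: str | None = None) -> str:
    """Build the library filename from a CUDA version such as "12.1"."""
    if override:
        # Mirrors BNB_CUDA_VERSION=<version string, e.g. 122> from the log.
        return f"libbitsandbytes_cuda{override}.so"
    if torch_cuda_version is None:
        # Assumed fallback name, as seen in the traceback.
        return "libbitsandbytes_cpu.so"
    major, minor = torch_cuda_version.split(".")[:2]
    return f"libbitsandbytes_cuda{major}{minor}.so"


if __name__ == "__main__":
    try:
        import torch  # optional; only used to read the CUDA version PyTorch was built with

        cuda = torch.version.cuda
    except ImportError:
        cuda = None
    print(expected_binary_name(cuda, os.environ.get("BNB_CUDA_VERSION")))
```

Under this reading, exporting `BNB_CUDA_VERSION=121` only changes which filename is looked up. The file still has to exist in the package directory, so the same error would be expected if the native library was never compiled.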