bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Python · MIT license
Issues
too large numeric difference with pytorch inference
#1396 opened by weixsong - 1
CUDA Setup failed despite GPU being available
#1394 opened by carlxc911 - 0
I ran an NF4 72B model on 2xA6000 using llamafactory
#1392 opened by charleswg - 0
CUDA Architecture 80+ Causing Incorrect Model Behavior with BitsAndBytes Quantization
#1391 opened by gunjunlee - 0
ARM Runners December 2024
#1390 opened by johnnynunez - 3
Release v0.44 not available for Mac
#1378 opened by ACMCMC - 2
where are the outliers stored in LLM.int8 quantization for inference using the transformers library on AMD GPU?
#1320 opened by vbayanag - 0
Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback): CUDA Setup failed despite GPU being available. Please run the following command to get more information: python -m bitsandbytes Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
#1387 opened by smillpine - 0
AdEMAMix NaN when loading from state_dict
#1382 opened by darius-lam - 12
FLUTE Integration for Fast Inference
#1293 opened by HanGuo97 - 1
Paged optimizer resuming from checkpoint - AttributeError: 'int' object has no attribute 'cpu'
#1381 opened by shivam15s - 0
Python 3.9 support broken in 0.44.0
#1376 opened by Benzhaomin - 0
Model architecture is modified when I use BitsAndBytesConfig with default params
#1371 opened by yunhao-tech - 2
CUDA is available but importing bnb raises an error
#1355 opened by ZeroneBo - 4
Merge LoRA into 405B
#1359 opened by junzhang-zj - 1
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasGemmEx
#1363 opened by LukeLIN-web - 0
Bug when using the 32-bit LAMB optimizer
#1350 opened by FrsECM - 1
Lion Optimizer With Triton Kernel
#1356 opened by lapp0 - 0
Model cannot be quantized
#1354 opened by alielfilali01 - 2
libcudart.so Not Found
#1313 opened by arunsandy1309 - 0
Torch autograd support for dequantize methods
#1347 opened by yaldashbz - 1
Cannot load decoder.lm_head.weight when loading 4 bit quantized model using VisionEncoderDecoder.from_pretrained
#1343 opened by AditiJain14 - 0
Pretrained Causal LM cannot be loaded in 4bit/8bit
#1331 opened by adrienchaton - 4
Any plan to support block size 32?
#1329 opened by lllyasviel - 0
Linear8bitLt cannot be moved back to CPU
#1332 opened by Nerogar - 1
'nf4' compute datatype?
#1321 opened by dorsa-zeinali - 1
RuntimeError: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
#1327 opened by pradeep10kumar - 1
Error while trying to install the multi-backend-refactor branch for ROCm in WSL2
#1323 opened by Kademo15 - 1
RuntimeError: CUDA Setup failed despite GPU being available. Please run the following command to get more information:
#1322 opened by pradeep10kumar - 1
NameError: name 'str2optimizer32bit' is not defined
#1281 opened by qingqinggu - 6
Communicate blocksize constraints to kernels that take blocksize as a runtime argument
#1317 opened by mm04926412 - 1
Runtime Error, cannot import name 'get_keys_to_not_convert' from 'transformers.integrations'
#1309 opened by zeruiz99 - 4
Unable to override PyTorch CUDA Version
#1315 opened by tinglvv - 2
Regarding bnb import error
#1306 opened by Mubashirshariq - 0
4bit quantized model.dequantize() fails on CPU
#1311 opened by npbool - 0
RuntimeError: CUDA Setup failed despite GPU being available. Please run the following command to get more information: python -m bitsandbytes Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
#1307 opened by senzawapoi - 0
bitsandbytes 8-bit quantized Llama 3.1 sometimes gets stuck when producing output
#1304 opened by Techbhatia - 1
Clarifying the quantization algorithm
#1283 opened by chrisjmccormick - 0
> I encountered the same issue on CUDA 11.6 and fixed it by building bitsandbytes from source. Below is my bash script for reference:
#1297 opened by insafim - 1
CUDA Setup failed despite GPU being available
#1289 opened by Keertiraj - 0
Who owns bitsandbytes?
#1288 opened by garrettbyrd