hollowstrawberry/kohya-colab

CUDA not working

guy907223982 opened this issue · 25 comments

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 122
CUDA SETUP: TODO: compile library for specific version: libbitsandbytes_cuda122.so
CUDA SETUP: Defaulting to libbitsandbytes.so...
CUDA SETUP: CUDA detection failed. Either CUDA driver not installed, CUDA not installed, or you have multiple conflicting CUDA libraries!
CUDA SETUP: If you compiled from source, try again with make CUDA_VERSION=DETECTED_CUDA_VERSION for example, make CUDA_VERSION=113.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /content/kohya-trainer/train_network.py:873 in │
│ │
│ 870 │ args = parser.parse_args() │
│ 871 │ args = train_util.read_config_from_file(args, parser) │
│ 872 │ │
│ ❱ 873 │ train(args) │
│ 874 │
│ │
│ /content/kohya-trainer/train_network.py:262 in train │
│ │
│ 259 │ │ ) │
│ 260 │ │ trainable_params = network.prepare_optimizer_params(args.text_encoder_lr, args.u │
│ 261 │ │
│ ❱ 262 │ optimizer_name, optimizer_args, optimizer = train_util.get_optimizer(args, trainable │
│ 263 │ │
│ 264 │ # dataloaderを準備する │
│ 265 │ # DataLoaderのプロセス数:0はメインプロセスになる │
│ │
│ /content/kohya-trainer/library/train_util.py:2700 in get_optimizer │
│ │
│ 2697 │ │
│ 2698 │ if optimizer_type == "AdamW8bit".lower(): │
│ 2699 │ │ try: │
│ ❱ 2700 │ │ │ import bitsandbytes as bnb │
│ 2701 │ │ except ImportError: │
│ 2702 │ │ │ raise ImportError("No bitsand bytes / bitsandbytesがインストールされていない │
│ 2703 │ │ print(f"use 8-bit AdamW optimizer | {optimizer_kwargs}") │
│ │
│ /usr/local/lib/python3.10/dist-packages/bitsandbytes/__init__.py:6 in <module> │
│ │
│ 3 # This source code is licensed under the MIT license found in the │
│ 4 # LICENSE file in the root directory of this source tree. │
│ 5 │
│ ❱ 6 from .autograd._functions import ( │
│ 7 │ MatmulLtState, │
│ 8 │ bmm_cublas, │
│ 9 │ matmul, │
│ │
│ /usr/local/lib/python3.10/dist-packages/bitsandbytes/autograd/_functions.py:5 in │
│ │
│ 2 import warnings │
│ 3 │
│ 4 import torch │
│ ❱ 5 import bitsandbytes.functional as F │
│ 6 │
│ 7 from dataclasses import dataclass │
│ 8 from functools import reduce # Required in Python 3 │
│ │
│ /usr/local/lib/python3.10/dist-packages/bitsandbytes/functional.py:13 in │
│ │
│ 10 from typing import Tuple │
│ 11 from torch import Tensor │
│ 12 │
│ ❱ 13 from .cextension import COMPILED_WITH_CUDA, lib │
│ 14 from functools import reduce # Required in Python 3 │
│ 15 │
│ 16 # math.prod not compatible with python < 3.8 │
│ │
│ /usr/local/lib/python3.10/dist-packages/bitsandbytes/cextension.py:41 in │
│ │
│ 38 │ │ return cls._instance │
│ 39 │
│ 40 │
│ ❱ 41 lib = CUDALibrary_Singleton.get_instance().lib │
│ 42 try: │
│ 43 │ lib.cadam32bit_g32 │
│ 44 │ lib.get_context.restype = ct.c_void_p │
│ │
│ /usr/local/lib/python3.10/dist-packages/bitsandbytes/cextension.py:37 in get_instance │
│ │
│ 34 │ def get_instance(cls): │
│ 35 │ │ if cls._instance is None: │
│ 36 │ │ │ cls._instance = cls.__new__(cls) │
│ ❱ 37 │ │ │ cls._instance.initialize() │
│ 38 │ │ return cls._instance │
│ 39 │
│ 40 │
│ │
│ /usr/local/lib/python3.10/dist-packages/bitsandbytes/cextension.py:27 in initialize │
│ │
│ 24 │ │ │ if not binary_path.exists(): │
│ 25 │ │ │ │ print('CUDA SETUP: CUDA detection failed. Either CUDA driver not install │
│ 26 │ │ │ │ print('CUDA SETUP: If you compiled from source, try again with `make CUD │
│ ❱ 27 │ │ │ │ raise Exception('CUDA SETUP: Setup Failed!') │
│ 28 │ │ │ self.lib = ct.cdll.LoadLibrary(binary_path) │
│ 29 │ │ else: │
│ 30 │ │ │ print(f"CUDA SETUP: Loading binary {binary_path}...") │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: CUDA SETUP: Setup Failed!
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/local/bin/accelerate:8 in │
│ │
│ 5 from accelerate.commands.accelerate_cli import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py:45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ /usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py:1104 in launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ /usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py:567 in simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python3', 'train_network.py',
'--dataset_config=/content/drive/MyDrive/Loras/ACB/dataset_config.toml',
'--config_file=/content/drive/MyDrive/Loras/ACB/training_config.toml']' returned non-zero
exit status 1.
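The failure path in the log above is bitsandbytes detecting CUDA 12.2, looking for a version-specific binary named libbitsandbytes_cuda122.so, not finding one (the installed 0.35.0 release predates CUDA 12), defaulting to the generic libbitsandbytes.so, and raising when that is missing too. A minimal sketch of that lookup, with illustrative names rather than the library's actual code:

```python
from pathlib import Path

def pick_binary(pkg_dir: Path, cuda_version: str) -> str:
    """Illustrative sketch of the binary lookup, not bitsandbytes' real code."""
    # Prefer the build compiled for the detected CUDA version, e.g. "122".
    candidate = pkg_dir / f"libbitsandbytes_cuda{cuda_version}.so"
    if candidate.exists():
        return candidate.name
    # Otherwise default to the generic library, as the log shows.
    fallback = pkg_dir / "libbitsandbytes.so"
    if fallback.exists():
        return fallback.name
    raise RuntimeError("CUDA SETUP: Setup Failed!")
```

Upgrading bitsandbytes to a release that actually ships a CUDA 12.2 binary makes the first lookup succeed, which is why the fix further down this thread works.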

Perhaps you ran out of GPU time for the week?

> Perhaps you ran out of GPU time for the week?

I didn't think about that

Same issue here. Definitely not a GPU time issue. Haven't used any in over a month.

> Same issue here. Definitely not a GPU time issue. Haven't used any in over a month.

When did this start happening?

I've only just encountered it, but then I haven't used the notebook in weeks.

I can confirm this happens every time starting today. Seems Colab updated their libraries again. Every time they do this it becomes trickier...

I'll take a look

thank you!

Same issue here today. Used yesterday with no issues.
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

CUDA SETUP: Detected CUDA version 122
CUDA SETUP: TODO: compile library for specific version: libbitsandbytes_cuda122.so
CUDA SETUP: Defaulting to libbitsandbytes.so...
CUDA SETUP: CUDA detection failed. Either CUDA driver not installed, CUDA not installed, or you have multiple conflicting CUDA libraries!
[...]
Exception: CUDA SETUP: Setup Failed!
[...]
CalledProcessError: Command '['/usr/bin/python3', 'train_network.py',
'--dataset_config=/content/drive/MyDrive/Loras/67Impala/dataset_config.toml',
'--config_file=/content/drive/MyDrive/Loras/67Impala/training_config.toml']' returned non-zero exit status 1.

Just as a data point, this was working five hours ago. Best of luck fixing this.

> I can confirm this happens every time starting today. Seems Colab updated their libraries again. Every time they do this it becomes trickier...

> I'll take a look

Thanks for putting efforts on this.

It comes down to this:

CUDA backend failed to initialize: Found CUDA version 12010, but JAX was built against version 12020, which is newer. The copy of CUDA that is installed must be at least as new as the version against which JAX was built. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)

I can't find a way to update CUDA or downgrade JAX properly.

If someone could help, we would all be thankful.
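A side note on reading these logs: the two tools encode CUDA versions differently. JAX prints an integer like 12010 (major × 1000 + minor × 10, so 12010 is CUDA 12.1 and 12020 is 12.2), while bitsandbytes prints 122 (major × 10 + minor, also 12.2). Small decoding helpers (the function names are mine, for illustration):

```python
def decode_jax_cuda(v: int) -> str:
    """JAX-style version integer: 12010 -> '12.1', 12020 -> '12.2'."""
    return f"{v // 1000}.{(v % 1000) // 10}"

def decode_bnb_cuda(v: int) -> str:
    """bitsandbytes-style version integer: 122 -> '12.2'."""
    return f"{v // 10}.{v % 10}"
```

Decoded, the warning above says the installed runtime is CUDA 12.1 but JAX was built against 12.2, hence the "must be at least as new" complaint.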

> It comes down to this:
>
> CUDA backend failed to initialize: Found CUDA version 12010, but JAX was built against version 12020, which is newer. The copy of CUDA that is installed must be at least as new as the version against which JAX was built. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
>
> I can't find a way to update CUDA or downgrade JAX properly.
>
> If someone could help, we would all be thankful.

Couldn't we just downgrade the JAX in terminal? If we have colab pro.

You don't need the colab pro terminal for that. Just need the right command.

> Just need the right command.

Yeah I just noticed that, the command below is not working:

pip install -U "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

The program is still not working

I tried several LoRA training notebooks on Colab, and they all had the same problem with CUDA...
If anyone could figure it out, it would be a really great thing. 😣

A friend said it started to work after running this command:

!pip install --upgrade bitsandbytes

Haven't tried it myself, but I'll share anyway.

> A friend said it started to work after running this command:
>
> !pip install --upgrade bitsandbytes
>
> Haven't tried it myself, but I'll share anyway.

It works! Thanks a lot for sharing!

> A friend said it started to work after running this command:
> !pip install --upgrade bitsandbytes
> Haven't tried it myself, but I'll share anyway.

> It works! Thanks a lot for sharing!

Where to put the command?

> A friend said it started to work after running this command:
> !pip install --upgrade bitsandbytes
> Haven't tried it myself, but I'll share anyway.

> It works! Thanks a lot for sharing!

> Where to put the command?

I put it at the bottom of the install dependencies function, as attached:

[screenshot]

I can confirm that the suggested addition works as described:

Installing collected packages: bitsandbytes
Attempting uninstall: bitsandbytes
Found existing installation: bitsandbytes 0.35.0
Uninstalling bitsandbytes-0.35.0:
Successfully uninstalled bitsandbytes-0.35.0
Successfully installed bitsandbytes-0.41.3.post2

✅ Installation finished in 148 seconds.
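To confirm the upgrade actually took effect before relaunching training, one can compare the installed version against a floor. A rough self-contained check (the helper is mine and deliberately simple; it ignores suffixes like "post2"):

```python
import re

def version_at_least(installed: str, required: str) -> bool:
    """Numeric compare of dotted versions, ignoring suffixes like 'post2'."""
    def nums(v: str) -> tuple:
        # Take the first three numeric components: "0.41.3.post2" -> (0, 41, 3)
        return tuple(int(x) for x in re.findall(r"\d+", v)[:3])
    return nums(installed) >= nums(required)
```

For example, `version_at_least(bitsandbytes.__version__, "0.41.0")` in a Colab cell; for real projects, `packaging.version.parse` is the more robust option.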

It actually worked!

[screenshot]

Thank you!

> A friend said it started to work after running this command:
>
> !pip install --upgrade bitsandbytes
>
> Haven't tried it myself, but I'll share anyway.

Thank you lots. I have added the upgraded bitsandbytes version to the requirements. The trainer is working again; no changes are needed on your end as of right now.
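For anyone maintaining a fork of the notebook, the equivalent requirements change is a floor pin along these lines (the exact specifier in the repo may differ):

```
bitsandbytes>=0.41.3
```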

I'm still not getting it, what can I do?

> I'm still not getting it, what can I do?

No need to do anything anymore; just open the LoRA training Colab notebook from the link again, as hollowstrawberry has updated it.