hiyouga/LLaMA-Factory

Bug in the Google Colab notebook code

Closed this issue · 7 comments

Reminder

  • I have read the README and searched the existing issues.

Reproduction

Fine-tune model via Command Line
I didn't change anything; I just ran it as is.


/content/LLaMA-Factory
2024-05-18 12:35:28.953301: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-18 12:35:28.953353: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-18 12:35:28.954649: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-18 12:35:30.175448: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
05/18/2024 12:35:33 - WARNING - llamafactory.hparams.parser - We recommend enable `upcast_layernorm` in quantized training.
05/18/2024 12:35:33 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.float16
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
tokenizer_config.json: 100% 51.1k/51.1k [00:00<00:00, 66.3MB/s]
tokenizer.json: 100% 9.09M/9.09M [00:01<00:00, 6.23MB/s]
special_tokens_map.json: 100% 459/459 [00:00<00:00, 2.69MB/s]
[INFO|tokenization_utils_base.py:2087] 2024-05-18 12:35:37,198 >> loading file tokenizer.json from cache at /root/.cache/huggingface/hub/models--unsloth--llama-3-8b-Instruct-bnb-4bit/snapshots/2950abc9d0b34ddd43fd52bbf0d7dca82807ce96/tokenizer.json
[INFO|tokenization_utils_base.py:2087] 2024-05-18 12:35:37,198 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:2087] 2024-05-18 12:35:37,198 >> loading file special_tokens_map.json from cache at /root/.cache/huggingface/hub/models--unsloth--llama-3-8b-Instruct-bnb-4bit/snapshots/2950abc9d0b34ddd43fd52bbf0d7dca82807ce96/special_tokens_map.json
[INFO|tokenization_utils_base.py:2087] 2024-05-18 12:35:37,198 >> loading file tokenizer_config.json from cache at /root/.cache/huggingface/hub/models--unsloth--llama-3-8b-Instruct-bnb-4bit/snapshots/2950abc9d0b34ddd43fd52bbf0d7dca82807ce96/tokenizer_config.json
[WARNING|logging.py:314] 2024-05-18 12:35:37,601 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
05/18/2024 12:35:37 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
05/18/2024 12:35:37 - INFO - llamafactory.data.loader - Loading dataset identity.json...
Generating train split: 91 examples [00:00, 5377.99 examples/s]
Converting format of dataset: 100% 91/91 [00:00<00:00, 7138.91 examples/s]
05/18/2024 12:35:38 - INFO - llamafactory.data.loader - Loading dataset llamafactory/alpaca_gpt4_en...
Downloading readme: 100% 373/373 [00:00<00:00, 2.43MB/s]
Downloading data: 100% 43.3M/43.3M [00:00<00:00, 51.9MB/s]
Generating train split: 51983 examples [00:01, 32182.19 examples/s]
Converting format of dataset: 100% 500/500 [00:00<00:00, 42058.28 examples/s]
Running tokenizer on dataset: 100% 591/591 [00:00<00:00, 1637.02 examples/s]
input_ids:
[128000, 128006, 9125, 128007, 271, 2675, 527, 264, 11190, 18328, 13, 128009, 128006, 882, 128007, 271, 6151, 128009, 128006, 78191, 128007, 271, 9906, 0, 358, 1097, 445, 81101, 12, 18, 11, 459, 15592, 18328, 8040, 555, 445, 8921, 4940, 17367, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
inputs:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

hi<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Hello! I am Llama-3, an AI assistant developed by LLaMA Factory. How can I assist you today?<|eot_id|>
label_ids:
[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 9906, 0, 358, 1097, 445, 81101, 12, 18, 11, 459, 15592, 18328, 8040, 555, 445, 8921, 4940, 17367, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
labels:
Hello! I am Llama-3, an AI assistant developed by LLaMA Factory. How can I assist you today?<|eot_id|>
config.json: 100% 1.15k/1.15k [00:00<00:00, 6.70MB/s]
[INFO|configuration_utils.py:726] 2024-05-18 12:35:49,628 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--unsloth--llama-3-8b-Instruct-bnb-4bit/snapshots/2950abc9d0b34ddd43fd52bbf0d7dca82807ce96/config.json
[INFO|configuration_utils.py:789] 2024-05-18 12:35:49,629 >> Model config LlamaConfig {
  "_name_or_path": "unsloth/llama-3-8b-Instruct-bnb-4bit",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": 128009,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 8192,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "pretraining_tp": 1,
  "quantization_config": {
    "_load_in_4bit": true,
    "_load_in_8bit": false,
    "bnb_4bit_compute_dtype": "bfloat16",
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": true,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": true,
    "load_in_8bit": false,
    "quant_method": "bitsandbytes"
  },
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 500000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.40.2",
  "use_cache": true,
  "vocab_size": 128256
}

05/18/2024 12:35:49 - INFO - llamafactory.model.utils.quantization - Loading ?-bit BITSANDBYTES-quantized model.
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/llamafactory/cli.py", line 65, in main
    run_exp()
  File "/usr/local/lib/python3.10/dist-packages/llamafactory/train/tuner.py", line 34, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/usr/local/lib/python3.10/dist-packages/llamafactory/train/sft/workflow.py", line 34, in run_sft
    model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
  File "/usr/local/lib/python3.10/dist-packages/llamafactory/model/loader.py", line 124, in load_model
    model = load_unsloth_pretrained_model(config, model_args)
  File "/usr/local/lib/python3.10/dist-packages/llamafactory/model/utils/unsloth.py", line 39, in load_unsloth_pretrained_model
    from unsloth import FastLanguageModel
  File "/usr/local/lib/python3.10/dist-packages/unsloth/__init__.py", line 113, in <module>
    from .models import *
  File "/usr/local/lib/python3.10/dist-packages/unsloth/models/__init__.py", line 15, in <module>
    from .loader  import FastLanguageModel
  File "/usr/local/lib/python3.10/dist-packages/unsloth/models/loader.py", line 15, in <module>
    from .llama import FastLlamaModel, logger
  File "/usr/local/lib/python3.10/dist-packages/unsloth/models/llama.py", line 27, in <module>
    from ._utils import *
  File "/usr/local/lib/python3.10/dist-packages/unsloth/models/_utils.py", line 60, in <module>
    import xformers.ops.fmha as xformers
ModuleNotFoundError: No module named 'xformers'


Expected behavior

No response

System Info

No response

Others

No response

Hi, I ran into this issue as well. At first I thought it was something local on my end, but it seems you hit the same error. My best guess as to why it happens is this line

!pip install --no-deps xformers<0.0.26

since the install logs show this:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spacy 3.7.4 requires typer<0.10.0,>=0.3.0, but you have typer 0.12.3 which is incompatible.
weasel 0.3.4 requires typer<0.10.0,>=0.3.0, but you have typer 0.12.3 which is incompatible.

At least that's what happens with the command-line section in Colab; the LLaMA Board one works fine.
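One thing worth checking, though I haven't confirmed it is the root cause: when that line runs in a shell cell, an unquoted < is treated as input redirection, so pip may never see the version specifier (or the command may fail outright) and xformers ends up not installed at all. A minimal sketch of the quoted form, assuming the cell is executed through the shell with !:

# quote the specifier so the shell does not treat "<" as a redirect
!pip install --no-deps "xformers<0.0.26"

# quick sanity check that the module is importable afterwards
!python -c "import xformers; print(xformers.__version__)"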

I think it's a conflict with another package. I tried !pip install xformers, and I get this:

Traceback (most recent call last):
  File "C:\Users\krafi\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\krafi\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\krafi\AppData\Local\Programs\Python\Python39\Scripts\llamafactory-cli.exe\__main__.py", line 4, in <module>
ModuleNotFoundError: No module named 'llmtuner'
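That second traceback looks unrelated to xformers: the llamafactory-cli.exe script on that Windows machine still imports llmtuner, the project's former package name, which suggests a stale console script left over from an earlier install. A hedged guess at a cleanup, assuming you are fine reinstalling from the current source tree:

# remove old installs whose entry points still reference the former "llmtuner" name
pip uninstall -y llmtuner llamafactory

# reinstall from current source so the llamafactory-cli entry point is regenerated
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .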

I see that the line !pip install --no-deps xformers<0.0.26 already exists in the Colab notebook, but it still doesn't fix it for me, and I don't know what changed. Could someone please check? Just run all cells, close the GUI, and then run the command-line section.
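If reinstalling xformers still doesn't help, a possible workaround (a guess, not an official fix) is to disable the unsloth path so the import is never attempted; the traceback only reaches from unsloth import FastLanguageModel via load_unsloth_pretrained_model. The file name below is a placeholder for whatever config file the notebook writes, and use_unsloth is my assumption about the option that enables unsloth:

# hedged workaround: turn off unsloth so LLaMA-Factory never imports xformers.
# "train_llama3.json" is a placeholder config name; "use_unsloth" is assumed to be
# the option gating the unsloth code path.
!sed -i 's/"use_unsloth": true/"use_unsloth": false/' train_llama3.json
!llamafactory-cli train train_llama3.json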

I just tested this in a new browser with another Google account and it still didn't work. I ran this Colab last week and it worked at that time, so maybe something changed in the code on GitHub.

Yes, it works now. Check out my project for creating a custom private dataset with Llama 3 offline:
https://gitlab.com/krafi/tuna-asyncio-with-llama.git
Thanks a lot for your support.