Error while loading PEFT lora model
Zuhashaik opened this issue · 4 comments
I trained the model using this LoRA config:
model.resize_token_embeddings(len(tokenizer))
# -> Embedding(33004, 4096)

lora_alpha = 2048
lora_dropout = 0.3
lora_r = 1024

target_modules = [
    # "up_proj",
    "o_proj",
    "v_proj",
    "gate_proj",
    "q_proj",
    # "down_proj",
    "k_proj",
]
modules_to_save = ["lm_head", "embed_tokens"]

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    target_modules=target_modules,
    bias="none",
    modules_to_save=modules_to_save,
    task_type="CAUSAL_LM",
)

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
trainable params: 1,839,038,464 || all params: 8,585,678,848 || trainable%: 21.419837575550556
trainer = Trainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=sample_train,
    eval_dataset=sample_val,
)
trainer.train()
I wanted to keep the word embedding layer trainable.
Training went smoothly with no problems, but when loading the checkpoints using AutoModelForCausalLM and PeftModel, I keep getting the same error:
from peft import LoraConfig
from peft import PeftModel
from transformers import AutoModelForCausalLM
import torch

repo_name = '/media/iiit/Karvalo/zuhair/Proj-multimodal/libri100-english-transcribe-1024,2048,0.2,kqvog-hubert_proj_grouped-wte/checkpoint-1'
config = LoraConfig.from_pretrained(repo_name)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Load the LoRA model
inference_model = PeftModel.from_pretrained(model, repo_name)
Loading checkpoint shards: 100%
2/2 [00:08<00:00, 3.89s/it]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[3], line 18
11 model = AutoModelForCausalLM.from_pretrained(
12 config.base_model_name_or_path,
13 device_map="auto",
14 torch_dtype=torch.bfloat16,
15 trust_remote_code=True,
16 )
17 # Load the LoRA model
---> 18 inference_model = PeftModel.from_pretrained(model, repo_name)
File ~/anaconda3/lib/python3.9/site-packages/peft/peft_model.py:356, in PeftModel.from_pretrained(cls, model, model_id, adapter_name, is_trainable, config, **kwargs)
354 else:
355 model = MODEL_TYPE_TO_PEFT_MODEL_MAPPING[config.task_type](model, config, adapter_name)
--> 356 model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
357 return model
File ~/anaconda3/lib/python3.9/site-packages/peft/peft_model.py:730, in PeftModel.load_adapter(self, model_id, adapter_name, is_trainable, **kwargs)
727 adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
729 # load the weights into the model
--> 730 load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
731 if (
732 (getattr(self, "hf_device_map", None) is not None)
733 and (len(set(self.hf_device_map.values()).intersection({"cpu", "disk"})) > 0)
734 and len(self.peft_config) == 1
735 ):
736 device_map = kwargs.get("device_map", "auto")
File ~/anaconda3/lib/python3.9/site-packages/peft/utils/save_and_load.py:249, in set_peft_model_state_dict(model, peft_model_state_dict, adapter_name)
246 else:
247 raise NotImplementedError
--> 249 load_result = model.load_state_dict(peft_model_state_dict, strict=False)
250 if config.is_prompt_learning:
251 model.prompt_encoder[adapter_name].embedding.load_state_dict(
252 {"weight": peft_model_state_dict["prompt_embeddings"]}, strict=True
253 )
File ~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py:2189, in Module.load_state_dict(self, state_dict, strict, assign)
2184 error_msgs.insert(
2185 0, 'Missing key(s) in state_dict: {}. '.format(
2186 ', '.join(f'"{k}"' for k in missing_keys)))
2188 if len(error_msgs) > 0:
-> 2189 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
2190 self.__class__.__name__, "\n\t".join(error_msgs)))
2191 return _IncompatibleKeys(missing_keys, unexpected_keys)
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
Also, my checkpoints are very large, so something must be wrong. Please help me figure this out.
[screenshot of the checkpoint directory]
Training went smoothly with no problems, but when loading the checkpoints using AutoModelForCausalLM and PeftModel, I keep getting the same error:
Let's look at the error message:
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
The sizes of the loaded model and the weights in the checkpoint differ. You resized the embedding layer earlier:
model.resize_token_embeddings(len(tokenizer))
and you have to perform the same steps when loading the model, otherwise the sizes don't match.
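Something along these lines should work when loading (a minimal, untested sketch; it assumes the tokenizer with the extended vocabulary was saved alongside the adapter checkpoint, and the path is a placeholder):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, PeftModel
import torch

repo_name = '<PATH-TO-YOUR-CHECKPOINT>'
config = LoraConfig.from_pretrained(repo_name)

base = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Recreate the vocabulary size used during training (32000 -> 33004)
tokenizer = AutoTokenizer.from_pretrained(repo_name)
base.resize_token_embeddings(len(tokenizer))

# Only now attach the adapter; the embed_tokens/lm_head shapes match the checkpoint
inference_model = PeftModel.from_pretrained(base, repo_name)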
Also, my checkpoints are very large, so something must be wrong. Please help me figure this out.
For this, let's also look at something you posted earlier:
trainable params: 1,839,038,464 || all params: 8,585,678,848 || trainable%: 21.419837575550556
As you can see, you have 1.8B trainable parameters, 21% of all parameters. That's quite a lot; normally, people end up with <1%. The reason for this high number is that you set a rank of 1024, which is very large, so the adapter matrices are barely smaller than the frozen weights they adapt.
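As a rough sanity check on where those 1.8B trainable parameters come from (this assumes standard Llama-2-7B shapes: 32 layers, hidden size 4096, MLP intermediate size 11008, and a vocabulary of 33004 after resizing):

# Back-of-the-envelope count of the trainable parameters reported above.
r = 1024
hidden, intermediate, vocab, layers = 4096, 11008, 33004, 32

# LoRA adds r * (in_features + out_features) parameters per targeted Linear layer.
attn = 4 * r * (hidden + hidden)      # q_proj, k_proj, v_proj, o_proj
mlp = r * (hidden + intermediate)     # gate_proj
lora_total = layers * (attn + mlp)    # 1,568,669,696

# modules_to_save stores a full trainable copy of each listed module.
saved = 2 * vocab * hidden            # embed_tokens + lm_head = 270,368,768

print(lora_total + saved)             # 1,839,038,464 -- matches print_trainable_parameters()

At r=1024, the LoRA matrices alone account for roughly 1.57B parameters, plus about 270M for the full copies of embed_tokens and lm_head, and all of these trainable weights are stored in every checkpoint, which is also why your checkpoints are so large.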
@BenjaminBossan Yes, I wanted the rank to be 1024 because I want the LLM to learn a new task that is not related to the text modality, and I'm giving more preference/weight to LoRA, so alpha is 2x the rank.
I found out how to fix my issue. The missing line in my code was:
model.resize_token_embeddings(len(tokenizer))
Before, I was just loading the model directly with AutoModelForCausalLM or PeftModel.
Here's what I needed to do:
- Start by loading the base model and the pretrained tokenizer (the one the model was trained with).
- Resize the model's word embedding matrix to match the tokenizer's vocabulary size.
- Finally, load the adapter with PeftModel.
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer
from peft import PeftModel

model_name_or_path = 'meta-llama/Llama-2-7b-chat-hf'
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

repo_name = '<PATH-TO-YOUR-REPO>'
tokenizer = LlamaTokenizer.from_pretrained(repo_name)

# Resize the embedding matrix to match the tokenizer used during training
model.resize_token_embeddings(len(tokenizer))

# Now the adapter's embed_tokens/lm_head weights fit and the checkpoint loads
model = PeftModel.from_pretrained(model, repo_name)
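As a quick smoke test after loading (the prompt and generation settings here are only illustrative):

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))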
I found out how to fix my issue.
Great.
Yes, I wanted the rank to be 1024 because I want the LLM to learn a new task that is not related to the text modality, and I'm giving more preference/weight to LoRA, so alpha is 2x the rank.
I'm not saying this can't work, but I have never seen such huge LoRA ranks being used; at that size there is hardly any advantage over full fine-tuning.