huggingface/peft

Error while loading PEFT lora model

Zuhashaik opened this issue · 4 comments

I trained the model using this LoRA config:

model.resize_token_embeddings(len(tokenizer))
# -> Embedding(33004, 4096)

lora_alpha = 2048
lora_dropout = 0.3
lora_r = 1024
target_modules=[
    # "up_proj",
    "o_proj",
    "v_proj",
    "gate_proj",
    "q_proj",
    # "down_proj",
    "k_proj"
  ]
modules_to_save = ["lm_head", "embed_tokens"]
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    target_modules = target_modules,
    bias="none",
    modules_to_save = modules_to_save,
    task_type="CAUSAL_LM")

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

trainable params: 1,839,038,464 || all params: 8,585,678,848 || trainable%: 21.419837575550556

trainer = Trainer(
    model=model, 
    tokenizer=tokenizer, 
    args=training_args, 
    train_dataset=sample_train, 
    eval_dataset=sample_val, 
)
trainer.train()

I wanted to keep my word embedding layer trainable.

Training went smoothly with no problems, but while loading the checkpoints using AutoModelForCausalLM and PeftModel I keep getting the same error:

from peft import LoraConfig
from transformers import AutoModelForCausalLM
from peft import PeftModel
import torch

repo_name = '/media/iiit/Karvalo/zuhair/Proj-multimodal/libri100-english-transcribe-1024,2048,0.2,kqvog-hubert_proj_grouped-wte/checkpoint-1'

config = LoraConfig.from_pretrained(repo_name)


model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
# Load the LoRA model
inference_model = PeftModel.from_pretrained(model, repo_name)
Loading checkpoint shards: 100%
 2/2 [00:08<00:00,  3.89s/it]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[3], line 18
     11 model = AutoModelForCausalLM.from_pretrained(
     12     config.base_model_name_or_path,
     13     device_map="auto",
     14     torch_dtype=torch.bfloat16,
     15     trust_remote_code=True,
     16 )
     17 # Load the LoRA model
---> 18 inference_model = PeftModel.from_pretrained(model, repo_name)

File ~/anaconda3/lib/python3.9/site-packages/peft/peft_model.py:356, in PeftModel.from_pretrained(cls, model, model_id, adapter_name, is_trainable, config, **kwargs)
    354 else:
    355     model = MODEL_TYPE_TO_PEFT_MODEL_MAPPING[config.task_type](model, config, adapter_name)
--> 356 model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
    357 return model

File ~/anaconda3/lib/python3.9/site-packages/peft/peft_model.py:730, in PeftModel.load_adapter(self, model_id, adapter_name, is_trainable, **kwargs)
    727 adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
    729 # load the weights into the model
--> 730 load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
    731 if (
    732     (getattr(self, "hf_device_map", None) is not None)
    733     and (len(set(self.hf_device_map.values()).intersection({"cpu", "disk"})) > 0)
    734     and len(self.peft_config) == 1
    735 ):
    736     device_map = kwargs.get("device_map", "auto")

File ~/anaconda3/lib/python3.9/site-packages/peft/utils/save_and_load.py:249, in set_peft_model_state_dict(model, peft_model_state_dict, adapter_name)
    246 else:
    247     raise NotImplementedError
--> 249 load_result = model.load_state_dict(peft_model_state_dict, strict=False)
    250 if config.is_prompt_learning:
    251     model.prompt_encoder[adapter_name].embedding.load_state_dict(
    252         {"weight": peft_model_state_dict["prompt_embeddings"]}, strict=True
    253     )

File ~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py:2189, in Module.load_state_dict(self, state_dict, strict, assign)
   2184         error_msgs.insert(
   2185             0, 'Missing key(s) in state_dict: {}. '.format(
   2186                 ', '.join(f'"{k}"' for k in missing_keys)))
   2188 if len(error_msgs) > 0:
-> 2189     raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2190                        self.__class__.__name__, "\n\t".join(error_msgs)))
   2191 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
	size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
	size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

Also, my checkpoints are very large; there is some problem, please help me clear this up. See the screenshot of my checkpoint directory:
(Screenshot from 2024-05-01 02-04-02)

Training went smoothly with no problems, but while loading the checkpoints using AutoModelForCausalLM and PeftModel I keep getting the same error:

Let's look at the error message:

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([33004, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

The sizes of the loaded model and the weights in the checkpoint differ. You resized the embedding layer earlier:

model.resize_token_embeddings(len(tokenizer))

and you have to perform the same steps when loading the model, otherwise the sizes don't match.
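For illustration, here is a minimal sketch of that loading order, assuming the extended tokenizer was saved alongside the adapter checkpoint (the path below is a placeholder):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, PeftModel

repo_name = '<PATH-TO-YOUR-CHECKPOINT>'  # placeholder for the adapter checkpoint directory

config = LoraConfig.from_pretrained(repo_name)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# The Trainer was given the tokenizer, so the extended vocab is stored with the checkpoint
tokenizer = AutoTokenizer.from_pretrained(repo_name)
# Grow embed_tokens / lm_head from 32000 to 33004 rows before attaching the adapter
model.resize_token_embeddings(len(tokenizer))

inference_model = PeftModel.from_pretrained(model, repo_name)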

Also, my checkpoints are very large; there is some problem, please help me clear this up.

For this, let's also look at something you posted earlier:

trainable params: 1,839,038,464 || all params: 8,585,678,848 || trainable%: 21.419837575550556

As you can see, you have 1.8B trainable parameters, 21% of all parameters. That's quite a lot; normally, folks have <1%. The reason for this high number is that you set a rank of 1024, which is very big, so the LoRA matrices are hardly smaller than the frozen weights they adapt.
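To see where that number comes from, here is a back-of-the-envelope sketch (assuming the standard Llama-2-7b shapes: hidden size 4096, MLP intermediate size 11008, 32 layers; the vocabulary size of 33004 comes from your resize above). Each LoRA adapter adds r * (in_features + out_features) parameters per targeted module, and modules_to_save keeps full trainable copies of embed_tokens and lm_head:

# Rough parameter count for r=1024 on Llama-2-7b
r = 1024
hidden, mlp, layers, vocab = 4096, 11008, 32, 33004

attn = 4 * r * (hidden + hidden)   # q_proj, k_proj, v_proj, o_proj
gate = r * (hidden + mlp)          # gate_proj
lora = layers * (attn + gate)      # LoRA A/B matrices across all layers
saved = 2 * vocab * hidden         # full copies of embed_tokens and lm_head

print(lora, saved, lora + saved)   # ~1.57B + ~0.27B = 1,839,038,464

At bfloat16 precision the adapter weights alone are roughly 3.7 GB, and the training checkpoints additionally contain optimizer states for all of these trainable parameters, which is why the checkpoint directory is so large.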

@BenjaminBossan Yes, I wanted the rank to be 1024 because I want the LLM to learn a new task that is not related to the text modality, and to give more preference/weight to LoRA, so alpha is 2× the rank.

I found out how to fix my issue. The missing line in my code is:
model.resize_token_embeddings(len(tokenizer))

Before, I was just loading the model directly using AutoModelForCausalLM or PeftModel.

Here's what I need to do:

  1. Start by loading the base model and your pretrained tokenizer (the one your model trained on).
  2. Adjust the size of the model's word embedding matrix to match the size of your tokenizer's vocabulary.
  3. Finally, load the model using PeftModel.
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer
from peft import PeftModel

# 1. Load the base model
model_name_or_path = 'meta-llama/Llama-2-7b-chat-hf'
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# 2. Load the tokenizer saved with the checkpoint and resize the embeddings to match
repo_name = '<PATH-TO-YOUR-REPO>'
tokenizer = LlamaTokenizer.from_pretrained(repo_name)
model.resize_token_embeddings(len(tokenizer))

# 3. Load the LoRA adapter
model = PeftModel.from_pretrained(model, repo_name)

I found out how to fix my issue.

Great.

Yes, I wanted the rank to be 1024 because I want the LLM to learn a new task that is not related to the text modality, and to give more preference/weight to LoRA, so alpha is 2× the rank.

I'm not saying this can't work, but I have never seen such huge LoRA ranks being used; at that size there is hardly any advantage over full fine-tuning.
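As a side note, a small sketch of how peft weights the adapter: the LoRA update is scaled by lora_alpha / r, so the extra "preference" you describe comes from the ratio, not from the absolute rank (the smaller values below are just for illustration):

# In peft's LoRA layers, the adapter output is multiplied by lora_alpha / r.
# The same scaling of 2.0 can therefore be had at a much smaller rank,
# with far fewer trainable parameters and a much smaller checkpoint.
for r, alpha in [(1024, 2048), (64, 128), (16, 32)]:
    print(f"r={r:4d}, alpha={alpha:4d} -> scaling = {alpha / r}")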