huggingface/peft

LoraConfig conflict when using `layers_to_transform` in `LlamaModel`

Opened this issue · 4 comments

System Info

peft: 0.13.2
transformers: 4.43.1

Who can help?

@BenjaminBossan @sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

When I tried to use LoraConfig to apply LoRA only to the first and last layers, like:

import torch
from transformers import LlamaModel
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    layers_to_transform=[0, 31],
    lora_dropout=0,
    bias="none",
)
model = LlamaModel.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16)
llama_model = get_peft_model(model, lora_config)

This raised the following error:

*** ValueError: Target modules ['q_proj', 'k_proj', 'v_proj', 'o_proj'] not found in the base model. Please check the target modules and try again.

A similar error occurs if I use layers_pattern instead of target_modules (though that may be my misunderstanding of layers_pattern):

lora_config = LoraConfig(
    ...
    layers_to_transform = 1, 
    layers_pattern = ["q_proj", "k_proj", "v_proj", "o_proj"], 
    ...
)
get_peft_model(model, lora_config)
*** ValueError: Target modules {'v_proj', 'q_proj'} not found in the base model. Please check the target modules and try again.

But this time the problem is probably caused by the default value of target_modules.

However, when I load the model with model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True) instead, it works.
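
For completeness, this is the full path that works for me (a minimal sketch with the same LoraConfig as above; the print_trainable_parameters call at the end is only there to check what was wrapped):

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    layers_to_transform=[0, 31],
    lora_dropout=0,
    bias="none",
)
# Loading through AutoModelForCausalLM instead of LlamaModel avoids the error.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()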

Expected behavior

I'm not sure whether this is a problem with LlamaModel. I'm also confused about the use of layers_pattern, since the LoRA docs mention:

  • layers_to_transform: List of layers to be transformed by LoRA. If not specified, all layers in target_modules are transformed.
  • layers_pattern: Pattern to match layer names in target_modules, if layers_to_transform is specified. By default PeftModel will look at common layer pattern (layers, h, blocks, etc.), use it for exotic and custom models.

It should work with layers_to_transform, however, I didn't find a suitable way to use it. Maybe some examples could be added to class LoraConfig(PeftConfig)?

Thanks for reporting the issue. Indeed, the usage of layers_to_transform and layers_pattern is a bit confusing and the error message is not helpful.

The idea here is that we have an nn.ModuleList with 32 layers in this case, and layers_pattern should designate this nn.ModuleList: layers_pattern="layers". Therefore, this works for me:

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    layers_to_transform=[0, 31],
    layers_pattern="layers",
    lora_dropout=0,
    bias="none",
)

However, as you noted, using LlamaModel directly does not work. This is a result of how we specify a regex and I think we can amend it to work with LlamaModel too. So for now, please use AutoModelForCausalLM with the LoraConfig I showed and you should be good.
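
To make the difference concrete, here is a quick sketch (illustration only, not PEFT internals) of how the module paths differ between the two classes; instantiating from the config with random weights is enough to inspect the names:

from transformers import AutoConfig, AutoModelForCausalLM, LlamaModel

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-8B")

# With the causal LM wrapper, the decoder layers live under a "model." prefix.
lm = AutoModelForCausalLM.from_config(config)
print(next(n for n, _ in lm.named_modules() if n.endswith("q_proj")))
# -> "model.layers.0.self_attn.q_proj"

# With LlamaModel directly, there is no prefix before "layers", which is what
# the current matching for layers_to_transform/layers_pattern expects.
base = LlamaModel(config)
print(next(n for n, _ in base.named_modules() if n.endswith("q_proj")))
# -> "layers.0.self_attn.q_proj"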

The TODOs from this issue are:

  1. Improve the documentation of these arguments to clarify what users need to pass.
  2. Amend the regex to make the prefix before the layers_pattern optional (a rough sketch of the idea follows after this list).
  3. Adjust the error message for the case that users pass layers_to_transform and layers_pattern (right now, the error message assumes that users only pass target_modules).
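
For point 2, the idea is roughly the following (illustrative only, the actual pattern in PEFT is built dynamically and differs in detail):

import re

# Illustration of point 2, not the actual PEFT code: making the prefix before
# the layers pattern optional lets both kinds of module paths match.
required_prefix = re.compile(r".*\.layers\.(0|31)\.")     # roughly the current behaviour
optional_prefix = re.compile(r"(.*\.)?layers\.(0|31)\.")  # prefix made optional

print(bool(required_prefix.match("layers.0.self_attn.q_proj")))         # False (LlamaModel)
print(bool(optional_prefix.match("layers.0.self_attn.q_proj")))         # True
print(bool(optional_prefix.match("model.layers.31.self_attn.q_proj")))  # True (AutoModelForCausalLM)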

For point 3, would you be interested in tackling this @JINO-ROHIT since you refactored that part in #2102?

@BenjaminBossan yeap, I'll be happy to work on this

@Evan02580 I created a PR to improve the docs in #2157 and another PR to adapt the regex in #2158. For the latter, I'm unsure if we should proceed though, as technically this is a backwards-incompatible change.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.