huggingface/peft

LoraConfig conflict when using `layers_to_transform` in `LlamaModel`

Opened this issue · 4 comments

System Info

peft: 0.13.2
transformers: 4.43.1

Who can help?

@BenjaminBossan @sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

When I tried to use LoraConfig to apply LoRA only to the first and last layers, like:

import torch
from transformers import LlamaModel
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    layers_to_transform=[0, 31],
    lora_dropout=0,
    bias="none",
)
model = LlamaModel.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16)
llama_model = get_peft_model(model, lora_config)

This raised the following error:

*** ValueError: Target modules ['q_proj', 'k_proj', 'v_proj', 'o_proj'] not found in the base model. Please check the target modules and try again.

A similar error occurs if I use layers_pattern instead of target_modules (though that may be my misunderstanding of layers_pattern):

lora_config = LoraConfig(
    ...
    layers_to_transform = 1, 
    layers_pattern = ["q_proj", "k_proj", "v_proj", "o_proj"], 
    ...
)
get_peft_model(model, lora_config)
*** ValueError: Target modules {'v_proj', 'q_proj'} not found in the base model. Please check the target modules and try again.

But this time the problem is probably caused by the default value of target_modules.

However, when I load the model with model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True) instead, it works.
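
For completeness, this is the full path that works for me (a minimal sketch with the same LoraConfig as above; the print_trainable_parameters call at the end is only there to check what was wrapped):

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    layers_to_transform=[0, 31],
    lora_dropout=0,
    bias="none",
)
# Loading through AutoModelForCausalLM instead of LlamaModel avoids the error.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()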

Expected behavior

I'm not sure whether this is a problem with LlamaModel. I'm also confused about the use of layers_pattern, since the LoRA docs mention:

  • layers_to_transform: List of layers to be transformed by LoRA. If not specified, all layers in target_modules are transformed.
  • layers_pattern: Pattern to match layer names in target_modules, if layers_to_transform is specified. By default PeftModel will look at common layer pattern (layers, h, blocks, etc.), use it for exotic and custom models.

It should work with layers_to_transform, however, I didn't find a suitable way to use it. Maybe some examples could be added to class LoraConfig(PeftConfig)?

Thanks for reporting the issue. Indeed, the usage of layers_to_transform and layers_pattern is a bit confusing and the error message is not helpful.

The idea here is that we have an nn.ModuleList with 32 layers in this case, and layers_pattern should designate this nn.ModuleList: layers_pattern="layers". Therefore, this works for me:

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    layers_to_transform=[0, 31],
    layers_pattern="layers",
    lora_dropout=0,
    bias="none",
)

However, as you noted, using LlamaModel directly does not work. This is a result of how we specify a regex and I think we can amend it to work with LlamaModel too. So for now, please use AutoModelForCausalLM with the LoraConfig I showed and you should be good.
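
To make the difference concrete, here is a quick sketch (illustration only, not PEFT internals) of how the module paths differ between the two classes; instantiating from the config with random weights is enough to inspect the names:

from transformers import AutoConfig, AutoModelForCausalLM, LlamaModel

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-8B")

# With the causal LM wrapper, the decoder layers live under a "model." prefix.
lm = AutoModelForCausalLM.from_config(config)
print(next(n for n, _ in lm.named_modules() if n.endswith("q_proj")))
# -> "model.layers.0.self_attn.q_proj"

# With LlamaModel directly, there is no prefix before "layers", which is what
# the current matching for layers_to_transform/layers_pattern expects.
base = LlamaModel(config)
print(next(n for n, _ in base.named_modules() if n.endswith("q_proj")))
# -> "layers.0.self_attn.q_proj"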

The TODOs from this issue are:

  1. Improve the documentation of these arguments to clarify what users need to pass.
  2. Amend the regex to make the prefix before the layers_pattern optional (a rough sketch of the idea follows after this list).
  3. Adjust the error message for the case that users pass layers_to_transform and layers_pattern (right now, the error message assumes that users only pass target_modules).
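
For point 2, the idea is roughly the following (illustrative only, the actual pattern in PEFT is built dynamically and differs in detail):

import re

# Illustration of point 2, not the actual PEFT code: making the prefix before
# the layers pattern optional lets both kinds of module paths match.
required_prefix = re.compile(r".*\.layers\.(0|31)\.")     # roughly the current behaviour
optional_prefix = re.compile(r"(.*\.)?layers\.(0|31)\.")  # prefix made optional

print(bool(required_prefix.match("layers.0.self_attn.q_proj")))         # False (LlamaModel)
print(bool(optional_prefix.match("layers.0.self_attn.q_proj")))         # True
print(bool(optional_prefix.match("model.layers.31.self_attn.q_proj")))  # True (AutoModelForCausalLM)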

For point 3, would you be interested in tackling this @JINO-ROHIT since you refactored that part in #2102?

@BenjaminBossan yeap, I'll be happy to work on this

@Evan02580 I created a PR to improve the docs in #2157 and another PR to adapt the regex in #2158. For the latter, I'm unsure if we should proceed though, as technically this is a backwards-incompatible change.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.