LoraConfig conflict when using `layers_to_transform` in `LlamaModel`
Opened this issue · 4 comments
System Info
peft: 0.13.2
transformers: 4.43.1
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder
- My own task or dataset (give details below)
Reproduction
When I tried to use LoraConfig to apply LoRA to the first and last layers, like:
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    layers_to_transform=[0, 31],
    lora_dropout=0,
    bias="none",
)
model = LlamaModel.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16)
llama_model = get_peft_model(model, lora_config)
the following error occurred:
*** ValueError: Target modules ['q_proj', 'k_proj', 'v_proj', 'o_proj'] not found in the base model. Please check the target modules and try again.
A similar error occurs if I use layers_pattern instead of target_modules (though that is probably my misunderstanding of layers_pattern):
lora_config = LoraConfig(
    ...
    layers_to_transform=1,
    layers_pattern=["q_proj", "k_proj", "v_proj", "o_proj"],
    ...
)
get_peft_model(model, lora_config)
*** ValueError: Target modules {'v_proj', 'q_proj'} not found in the base model. Please check the target modules and try again.
But this time the error presumably comes from the default value of target_modules.
However, when I use model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True) instead, it works.
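One way to see the difference between the two cases is to compare the module names each class exposes; a minimal sketch, assuming the same checkpoint as above and enough memory to load it twice:

```python
import torch
from transformers import AutoModelForCausalLM, LlamaModel

# Print the name of the first q_proj module in each case. With LlamaModel the
# decoder layers sit directly under "layers.*", while the AutoModelForCausalLM
# wrapper adds a "model." prefix (e.g. "model.layers.0.self_attn.q_proj").
for cls in (LlamaModel, AutoModelForCausalLM):
    model = cls.from_pretrained(
        "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
    )
    q_proj_names = [n for n, _ in model.named_modules() if n.endswith("q_proj")]
    print(cls.__name__, q_proj_names[0])
```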
Expected behavior
I'm not sure if this is a problem with LlamaModel. I'm also confused about how to use layers_pattern, since the LoRA docs say:

layers_to_transform: List of layers to be transformed by LoRA. If not specified, all layers in target_modules are transformed.
layers_pattern: Pattern to match layer names in target_modules, if layers_to_transform is specified. By default PeftModel will look at common layer patterns (layers, h, blocks, etc.); use it for exotic and custom models.

It should work together with layers_to_transform, but I didn't find a suitable way to use it. Maybe some examples could be added to class LoraConfig(PeftConfig)?
Thanks for reporting the issue. Indeed, the usage of layers_to_transform and layers_pattern is a bit confusing and the error message is not helpful.
The idea here is that if we have an nn.ModuleList with 32 layers, as in this case, layers_pattern should designate this nn.ModuleList: layers_pattern="layers". Therefore, this works for me:
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    layers_to_transform=[0, 31],
    layers_pattern="layers",
    lora_dropout=0,
    bias="none",
)
However, as you noted, using LlamaModel directly does not work. This is a result of how we specify a regex, and I think we can amend it to work with LlamaModel too. So for now, please use AutoModelForCausalLM with the LoraConfig I showed and you should be good.
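For completeness, here is a sketch of the full working combination based on the config above (model name and dtype taken from the original snippet); the loop at the end is a rough check, relying on the typical PEFT module naming, that only layers 0 and 31 received LoRA weights:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    layers_to_transform=[0, 31],
    layers_pattern="layers",
    lora_dropout=0,
    bias="none",
)
peft_model = get_peft_model(model, lora_config)

# Collect the decoder layer indices that received LoRA modules; with the config
# above this is expected to print [0, 31]. Module names typically look like
# "base_model.model.model.layers.0.self_attn.q_proj.lora_A.default".
lora_layers = set()
for name, _ in peft_model.named_modules():
    if "lora_A" in name:
        layer_idx = name.split("layers.")[1].split(".")[0]
        lora_layers.add(int(layer_idx))
print(sorted(lora_layers))
```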
The TODOs from this issue are:
- Improve the documentation of these arguments to clarify what users need to pass.
- Amend the regex to make the prefix before the layers_pattern optional (see the sketch after this list).
- Adjust the error message for the case that users pass layers_to_transform and layers_pattern (right now, the error message assumes that users only pass target_modules).
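To illustrate the second point, a toy regex sketch (not PEFT's actual implementation) showing how making the prefix before the layers pattern optional changes what matches:

```python
import re

# Illustrative only: a pattern that requires something before "layers." matches
# the names produced by AutoModelForCausalLM ("model.layers.0...."), but not the
# ones produced by LlamaModel directly ("layers.0...."). Making the prefix
# optional accepts both.
strict = re.compile(r".*\.layers\.(0|31)\.")
relaxed = re.compile(r"(?:.*\.)?layers\.(0|31)\.")

for name in ["model.layers.0.self_attn.q_proj", "layers.0.self_attn.q_proj"]:
    print(name, bool(strict.match(name)), bool(relaxed.match(name)))
```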
For point 3, would you be interested in tackling this @JINO-ROHIT since you refactored that part in #2102?
@BenjaminBossan yep, I'll be happy to work on this
@Evan02580 I created a PR to improve the docs in #2157 and another PR to adapt the regex in #2158. For the latter, I'm unsure if we should proceed though, as technically this is a backwards-incompatible change.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.