How to set lora_dropout=0 when loading a trained PEFT model for inference?
flyliu2017 opened this issue · 2 comments
System Info
peft==0.10.0
transformers==4.39.3
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder
- My own task or dataset (give details below)
Reproduction
# Excerpt from peft/tuners/lora/layer.py (peft 0.10.0):
class Linear(nn.Module, LoraLayer):
    def forward(self, x: torch.Tensor, *args: Any, **kwargs: Any) -> torch.Tensor:
        self._check_forward_args(x, *args, **kwargs)
        adapter_names = kwargs.pop("adapter_names", None)

        if self.disable_adapters:
            if self.merged:
                self.unmerge()
            result = self.base_layer(x, *args, **kwargs)
        elif adapter_names is not None:
            result = self._mixed_batch_forward(x, *args, adapter_names=adapter_names, **kwargs)
        elif self.merged:
            result = self.base_layer(x, *args, **kwargs)
        else:
            result = self.base_layer(x, *args, **kwargs)
            torch_result_dtype = result.dtype
            for active_adapter in self.active_adapters:
                if active_adapter not in self.lora_A.keys():
                    continue
                lora_A = self.lora_A[active_adapter]
                lora_B = self.lora_B[active_adapter]
                # The dropout module is called unconditionally below; whether it
                # actually drops anything depends on the module's train/eval mode
                # (and it is nn.Identity when lora_dropout == 0 at initialization).
                dropout = self.lora_dropout[active_adapter]
                scaling = self.scaling[active_adapter]
                x = x.to(lora_A.weight.dtype)

                if not self.use_dora[active_adapter]:
                    result = result + lora_B(lora_A(dropout(x))) * scaling
                else:
                    x = dropout(x)
                    result = result + self._apply_dora(x, lora_A, lora_B, scaling, active_adapter)

            result = result.to(torch_result_dtype)

        return result
Expected behavior
We can see that `lora_dropout` in the forward function is applied in the same way whether the model is in training or inference mode.
> We can see that `lora_dropout` in the forward function is applied in the same way whether the model is in training or inference mode.

Did you try it out? The `nn.Dropout` layer does not apply dropout unless it is in training mode. Moreover, when we set dropout to 0 at initialization, `self.lora_dropout` is set to `nn.Identity`. Please check whether dropout is really applied in your case or whether it's a misunderstanding of the code.
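A quick way to verify the first point in isolation (a minimal sketch in plain PyTorch, independent of peft):

```python
import torch
import torch.nn as nn

dropout = nn.Dropout(p=0.1)
x = torch.ones(8)

dropout.train()
print(dropout(x))  # some elements zeroed, survivors scaled by 1 / (1 - p)

dropout.eval()
print(dropout(x))  # identical to x: dropout is a no-op in eval mode
```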
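And for the second point, a sketch showing that `lora_dropout=0` puts `nn.Identity` in the dropout slot (`sshleifer/tiny-gpt2` is just a stand-in base model here; any small model works):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")
config = LoraConfig(lora_dropout=0.0, target_modules=["c_attn"])
model = get_peft_model(base, config)

# Inspect the first LoRA layer: with lora_dropout=0, the dropout slot
# holds nn.Identity rather than nn.Dropout.
for name, module in model.named_modules():
    if hasattr(module, "lora_dropout"):
        print(name, module.lora_dropout["default"])  # ... Identity()
        break
```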
Thank you! The key point is the `training` mode of the model! I trained the model without evaluation, so after training the model was still in training mode, which led to inconsistent performance between that model and the one loaded from a checkpoint.
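For anyone hitting the same thing, a minimal sketch of the fix (the base model id and adapter path below are placeholders for your own checkpoint): call `model.eval()` before running inference.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-id")  # placeholder id
model = PeftModel.from_pretrained(base, "path/to/adapter")    # placeholder path

# eval() switches every submodule, including nn.Dropout, to inference mode,
# so results match between a freshly trained model and a loaded checkpoint.
model.eval()
```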