huggingface/peft

How to set lora_dropout=0 when loading a trained PEFT model for inference?

flyliu2017 opened this issue · 2 comments

System Info

peft==0.10.0
transformers==4.39.3

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

# Excerpt from PEFT's LoRA Linear layer
class Linear(nn.Module, LoraLayer):

    def forward(self, x: torch.Tensor, *args: Any, **kwargs: Any) -> torch.Tensor:
        self._check_forward_args(x, *args, **kwargs)
        adapter_names = kwargs.pop("adapter_names", None)

        if self.disable_adapters:
            if self.merged:
                self.unmerge()
            result = self.base_layer(x, *args, **kwargs)
        elif adapter_names is not None:
            result = self._mixed_batch_forward(x, *args, adapter_names=adapter_names, **kwargs)
        elif self.merged:
            result = self.base_layer(x, *args, **kwargs)
        else:
            result = self.base_layer(x, *args, **kwargs)
            torch_result_dtype = result.dtype
            for active_adapter in self.active_adapters:
                if active_adapter not in self.lora_A.keys():
                    continue
                lora_A = self.lora_A[active_adapter]
                lora_B = self.lora_B[active_adapter]
                dropout = self.lora_dropout[active_adapter]  # nn.Dropout, or nn.Identity when lora_dropout == 0
                scaling = self.scaling[active_adapter]
                x = x.to(lora_A.weight.dtype)

                if not self.use_dora[active_adapter]:
                    result = result + lora_B(lora_A(dropout(x))) * scaling
                else:
                    x = dropout(x)
                    result = result + self._apply_dora(x, lora_A, lora_B, scaling, active_adapter)

            result = result.to(torch_result_dtype)

        return result

Expected behavior

We can see that lora_dropout in the forward function is applied in the same way whether the model is in training or inference mode.

Did you try it out? The nn.Dropout layer does not apply dropout unless it is in training mode. Moreover, when dropout is set to 0 at initialization, the lora_dropout module is set to nn.Identity. Please check whether dropout is really applied in your case or whether it's a misunderstanding of the code.
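
A quick way to check this outside of PEFT is a minimal sketch with a bare nn.Dropout (nothing here is PEFT-specific):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.1)
x = torch.ones(8)

drop.train()   # training mode: elements are randomly zeroed, the rest rescaled by 1 / (1 - p)
print(drop(x))

drop.eval()    # eval/inference mode: dropout is a no-op
print(torch.equal(drop(x), x))  # True

# In PEFT, when lora_dropout == 0 the stored dropout module is nn.Identity(),
# so it is a no-op in both modes.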

Thank you! The key point is the training mode of the model! I trained without an evaluation step, so after training the model was still in training mode, which led to inconsistent outputs between it and the one loaded from a checkpoint.
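
For anyone hitting the same symptom, a minimal inference sketch (the checkpoint path is a placeholder, and the tokenizer is assumed to have been saved alongside the adapter):

import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# "path/to/lora-checkpoint" is a placeholder for the saved adapter directory.
model = AutoPeftModelForCausalLM.from_pretrained("path/to/lora-checkpoint")
tokenizer = AutoTokenizer.from_pretrained("path/to/lora-checkpoint")

model.eval()  # disables every nn.Dropout, including the LoRA ones, for inference

inputs = tokenizer("Hello, world", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Calling model.eval() on the in-memory model right after training should likewise make its outputs consistent with the checkpoint-loaded one.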