microsoft/LoRA

[Minor] Possible typos in weight initialization

awgu opened this issue · 0 comments

awgu commented

The recent commit a0a92e0 flipped A and B in the comment for the LoRA Linear module:

LoRA/loralib/layers.py, lines 119 to 125 in a0a92e0:

```python
def reset_parameters(self):
    nn.Linear.reset_parameters(self)
    if hasattr(self, 'lora_A'):
        # initialize B the same way as the default for nn.Linear and A to zero
        # this is different than what is described in the paper but should not affect performance
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        nn.init.zeros_(self.lora_B)
```
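
If the code itself is the intended behavior (zero-initializing lora_B so the LoRA update starts at zero), then only the comment needs its A and B swapped. A minimal sketch of how the corrected comment might read, leaving the code unchanged:

```python
def reset_parameters(self):
    nn.Linear.reset_parameters(self)
    if hasattr(self, 'lora_A'):
        # initialize A the same way as the default for nn.Linear and B to zero
        # this is different than what is described in the paper but should not affect performance
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        nn.init.zeros_(self.lora_B)
```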

The LoRA Embedding module similarly has the initialization itself flipped (I am not sure whether this is intentional):

LoRA/loralib/layers.py, lines 55 to 60 in a0a92e0:

```python
def reset_parameters(self):
    nn.Embedding.reset_parameters(self)
    if hasattr(self, 'lora_A'):
        # initialize A the same way as the default for nn.Linear and B to zero
        nn.init.zeros_(self.lora_A)
        nn.init.normal_(self.lora_B)
```

Following the paper, I would expect the Embedding module to use nn.init.normal_(self.lora_A) and nn.init.zeros_(self.lora_B).
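
Concretely, a sketch of what the Embedding module's reset_parameters might look like with the initialization matching the paper (assuming nn.init.normal_ keeps its current default arguments):

```python
def reset_parameters(self):
    nn.Embedding.reset_parameters(self)
    if hasattr(self, 'lora_A'):
        # initialize A the same way as the default for nn.Linear and B to zero,
        # so that the LoRA update B @ A is zero at the start of training
        nn.init.normal_(self.lora_A)
        nn.init.zeros_(self.lora_B)
```

Since one of the two factors is zero at initialization either way, the product B @ A starts at zero in both versions, so this is mainly about consistency with the paper and the comments rather than a behavioral change.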

I am happy to open a PR to fix these if you want (though I cannot seem to save the file without removing trailing whitespace, for some reason).