What does common.Scale(1) mean?
yunfanLu commented
class Scale(nn.Module):
    def __init__(self, init_value=1e-3):
        super().__init__()
        self.scale = nn.Parameter(torch.FloatTensor([init_value]))

    def forward(self, input):
        return input * self.scale
When self.scale = 1, as it is right after common.Scale(1) is constructed, does this operation do nothing?
Why do we need this layer?
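To make the question concrete, here is a minimal sketch that checks what Scale(1) does at initialization and after one training step. The Scale class is copied from the snippet above; the SGD optimizer, the learning rate of 0.1, and the toy inputs are assumptions chosen only for illustration and are not taken from the repository.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):
    def __init__(self, init_value=1e-3):
        super().__init__()
        self.scale = nn.Parameter(torch.FloatTensor([init_value]))

    def forward(self, input):
        return input * self.scale


layer = Scale(1)
x = torch.randn(2, 3)

# At initialization the layer multiplies by 1, so the output equals the input.
identity_at_init = torch.equal(layer(x), x)

# self.scale is an nn.Parameter, so an optimizer can still update it.
# Hypothetical toy setup: loss = sum(ones(4) * scale), so d(loss)/d(scale) = 4.
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = layer(torch.ones(4)).sum()
loss.backward()
optimizer.step()

# One SGD step: 1 - 0.1 * 4 = 0.6 (up to float32 rounding).
scale_after_step = layer.scale.item()
print(identity_at_init, scale_after_step)
```

So in this sketch the layer is an identity only at initialization; because self.scale is registered as a parameter, training can move it away from 1. Whether that is the intended purpose in this repository is for the authors to confirm.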
yunfanLu commented
Is self.scale the learnable parameter 𝜆𝑥 in the paper?