avalonstrel/GatedConvolution_pytorch

About self attention

Janspiry opened this issue · 0 comments

Hello. In the Self_Attn module, gamma is initialized to torch.zeros(1), and the output is then computed as out = x + gamma*out. Why is gamma initialized to zero rather than to some other value, such as torch.ones(1)?
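
For reference, here is a minimal sketch of the behavior being asked about. It is not the repository's actual Self_Attn code: the class name SelfAttnSketch, the gamma_init argument, and the layer sizes are hypothetical and exist only for illustration. It assumes a SAGAN-style block in which gamma is a learnable scalar. With gamma at zero the block initially returns its input unchanged, and gamma still receives a gradient, so it can move away from zero during training. This matches the rationale given in the SAGAN paper, where this kind of block appears: the network can rely on local cues first and gradually give weight to the attention branch.

```python
import torch
import torch.nn as nn


class SelfAttnSketch(nn.Module):
    """Hypothetical SAGAN-style self-attention block (illustration only)."""

    def __init__(self, channels: int, gamma_init: float = 0.0):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        # Learnable scalar that weights the attention branch.
        self.gamma = nn.Parameter(torch.full((1,), gamma_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)  # (b, hw, c//8)
        k = self.key(x).view(b, -1, h * w)                     # (b, c//8, hw)
        attn = torch.softmax(torch.bmm(q, k), dim=-1)          # (b, hw, hw)
        v = self.value(x).view(b, c, h * w)                    # (b, c, hw)
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        # Residual connection: the attention output is scaled by gamma.
        return x + self.gamma * out


torch.manual_seed(0)
x = torch.randn(2, 16, 8, 8)

zero_init = SelfAttnSketch(16, gamma_init=0.0)
one_init = SelfAttnSketch(16, gamma_init=1.0)

# With gamma = 0 the block acts as an identity mapping at initialization.
identity_at_zero = torch.allclose(zero_init(x), x)
# With gamma = 1 the untrained attention output is mixed in from the start.
identity_at_one = torch.allclose(one_init(x), x)

# gamma still receives a gradient at zero, so training can increase it.
zero_init(x).sum().backward()
gamma_has_grad = zero_init.gamma.grad is not None
```

In this sketch, initializing gamma to one would add the output of randomly initialized attention layers to the features from the first step, whereas zero lets the rest of the network train as if the block were absent until gamma grows.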