The scaling parameter is not learned?
Opened this issue · 0 comments
LindgeW commented
The paper claims the scaling parameter is learned, but your implementation is a fixed scalar.
Opened this issue · 0 comments
The paper claims the scaling parameter is learned, but your implementation is a fixed scalar.