lyhue1991/eat_pytorch_in_20_days

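The Transformer layer source code adds dropout in many places, but the paper only mentions applying it to each sub-layer's output before add & layer norm. Which one is correct?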

pandaupc opened this issue · 0 comments

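The Transformer layer source code adds dropout in many places, but the paper only says to apply it to the output of each sub-layer, before the add & layer norm step. Which of the two is correct?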
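
For reference, here is a minimal sketch of the placement the paper describes, i.e. `LayerNorm(x + Dropout(Sublayer(x)))`. The class name `PostNormSublayer`, the sizes, and the dropout rate are hypothetical choices for illustration and are not taken from the source code in question.

```python
import torch
from torch import nn


class PostNormSublayer(nn.Module):
    """Hypothetical helper: LayerNorm(x + Dropout(sublayer(x)))."""

    def __init__(self, d_model: int, p_drop: float = 0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(p_drop)

    def forward(self, x, sublayer):
        # Dropout is applied to the sub-layer output,
        # then the residual is added, then LayerNorm is applied.
        return self.norm(x + self.dropout(sublayer(x)))


d_model = 16  # assumed size, for illustration only
# A position-wise feed-forward sub-layer without any internal dropout.
ffn = nn.Sequential(nn.Linear(d_model, 64), nn.ReLU(), nn.Linear(64, d_model))
block = PostNormSublayer(d_model)

x = torch.randn(2, 5, d_model)  # (batch, sequence, features)
block.eval()                    # dropout becomes the identity in eval mode
y = block(x, ffn)
```

Some implementations go beyond this placement. `torch.nn.TransformerEncoderLayer`, for instance, also applies dropout to the attention weights and between the two feed-forward linear layers, which may be the extra dropout the question refers to.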