Issues
Could you add transformer, seq2seq, and attention?
#29 opened by zuowanbushiwo - 2
About Huber Loss
#26 opened by Zhangang1999 - 1
SOFTPLUS
#25 opened by Zhangang1999 - 8
Step should average the gradients by batch size.
#12 opened by w32zhong - 3
One suggestion on file header
#5 opened by w32zhong - 3
Avoid downloading the dataset programmatically.
#1 opened by w32zhong - 2