Albert-Ma/study-notes

BPE and various normalization methods in deep learning


Neural Machine Translation of Rare Words with Subword Units
dropout
batch normalization
layer norm
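
As a study aid for the first item, here is a minimal sketch of how byte-pair encoding (BPE) merges can be learned from word frequencies, in the spirit of the subword paper listed above. The function names (`get_pair_counts`, `merge_pair`, `learn_bpe`), the toy corpus, and the tie-breaking rule (first pair seen wins) are assumptions made for illustration, not the paper's reference implementation.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: count} dict."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        # Most frequent pair; ties go to the pair that was counted first.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


# Hypothetical toy corpus, used only for illustration.
toy_corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, vocab = learn_bpe(toy_corpus, num_merges=3)
print(merges)
print(vocab)
```

Frequent character sequences such as a shared word ending get merged into single symbols first, while rare words remain split into smaller units. That is what lets a fixed-size subword vocabulary cover rare and unseen words.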