Surprising PPL on WMT 17
luffycodes opened this issue · 0 comments
Running the code with n_head set to 1 gives a validation PPL of 6.65 (all other parameters are the same as in the README). The resulting log is attached below. I'm surprised by such a low PPL, because the default n_head setting gives a PPL of 11. Is this behaviour expected?
"[ Epoch 356 ]
- (Training) ppl: 11.29374, accuracy: 74.314 %, elapse: 0.540 min
- (Validation) ppl: 6.65451, accuracy: 67.306 %, elapse: 0.006 min
- [Info] The checkpoint file has been updated."
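
For reference, here is a small sketch that converts the two ppl values from the log back to average per-token losses. It assumes the reported ppl is the exponential of the mean per-token cross-entropy (in nats). That is the usual definition, but I have not confirmed that this code computes it that way. The helper name `ppl_to_loss` is made up for illustration.

```python
import math


def ppl_to_loss(ppl: float) -> float:
    """Invert ppl = exp(mean per-token cross-entropy), assuming natural log."""
    return math.log(ppl)


# Values copied from the epoch 356 log above.
train_ppl = 11.29374
valid_ppl = 6.65451

train_loss = ppl_to_loss(train_ppl)
valid_loss = ppl_to_loss(valid_ppl)

print(f"train loss (nats/token): {train_loss:.3f}")
print(f"valid loss (nats/token): {valid_loss:.3f}")
print(f"gap (train - valid):     {train_loss - valid_loss:.3f}")
```

Under that assumption, the validation loss is roughly half a nat per token lower than the training loss, which is the gap I am asking about.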