Reasons why pre-training was not effective
Opened this issue · 0 comments
abebe9849 commented
Pre-training the transformer seems like an effective idea. Do you have any theories as to why it didn't work?