regarding paper describing this model
wenouyang opened this issue · 1 comment
wenouyang commented
Hi, thanks for sharing the code. Is there a research paper discussing this model?
Thanks.
ASvyatkovskiy commented
The following paper discusses some aspects of the model (mainly mixed-precision floating-point training):
https://dl.acm.org/citation.cfm?id=3146358
It also discusses the distributed training algorithm, learning rate scheduling, and the neural network architecture.
We have another paper that has been submitted to a journal; it will become available soon.