Optimizing Deeper Transformers on Small Datasets https://arxiv.org/abs/2012.15355

DT-Fixup

Optimizing Deeper Transformers on Small Datasets

Paper published at ACL 2021 and available on arXiv: https://arxiv.org/abs/2012.15355

Detailed instructions for replicating the results in our paper are in the spider and reclor folders.

Cite

If you find this codebase or our work useful, please cite:

@InProceedings{xu2021optimizing,
  author = {Xu, Peng and Kumar, Dhruv and Yang, Wei and Zi, Wenjie and Tang, Keyi and Huang, Chenyang and Cheung, Jackie Chi Kit and Prince, Simon J.D. and Cao, Yanshuai},
  title = {Optimizing Deeper Transformers on Small Datasets},
  booktitle = {The 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021)},
  month = {August},
  year = {2021},
  publisher = {ACL}
}