Code pre-release of our work, Unified Pre-training for Program Understanding and Generation, accepted at NAACL 2021.
Note: Detailed documentation is coming soon.
PLBART is pre-trained on Java and Python functions and natural language descriptions collected from GitHub and Stack Overflow.
We evaluated PLBART on five tasks:
- Code summarization [REF]
- Code generation [REF]
- Code translation [REF]
- Clone detection [REF]
- Vulnerability detection [REF]
- We will publish the pre-trained PLBART checkpoint soon.
- We list all the files in this repository here.
PLBART builds on Fairseq, CodeXGLUE, and TransCoder; we thank the authors of these works for their contributions.
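Because PLBART is implemented on top of Fairseq, a released checkpoint should in principle be loadable through Fairseq's standard BART hub interface. The sketch below is a minimal, hypothetical example of generating a natural-language summary for a Java function; the checkpoint directory, checkpoint file name, binarized data path, and input pre-tokenization are placeholders/assumptions, not the official usage (which will be covered in the upcoming documentation).

```python
# Minimal sketch (not official usage): load a PLBART checkpoint via Fairseq's
# BART hub interface and generate a code summary. All paths below are
# placeholders and assume a standard Fairseq BART-style checkpoint layout.
from fairseq.models.bart import BARTModel

plbart = BARTModel.from_pretrained(
    'checkpoints/plbart_base',                   # placeholder checkpoint directory
    checkpoint_file='checkpoint_best.pt',        # placeholder checkpoint file name
    data_name_or_path='data-bin/summarization',  # placeholder binarized task data
)
plbart.eval()  # disable dropout for inference

# The input is assumed to be tokenized/segmented the way the checkpoint expects
# (e.g., with the model's own subword vocabulary).
java_function = 'public int add ( int a , int b ) { return a + b ; }'
summary = plbart.sample([java_function], beam=5, lenpen=1.0, max_len_b=64)[0]
print(summary)
```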
@inproceedings{ahmad2021unified,
author = {Ahmad, Wasi Uddin and Chakraborty, Saikat and Ray, Baishakhi and Chang, Kai-Wei},
booktitle = {Proceedings of the 2021 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies},
title = {Unified Pre-training for Program Understanding and Generation},
year = {2021}
}