Encode Once and Decode in Parallel

Our code will be made available upon acceptance. Stay tuned for updates!

This is the official PyTorch implementation of Encode Once and Decode in Parallel: Efficient Transformer Decoding. Please refer to our paper for details.

Encode Once and Decode in Parallel: Efficient Transformer Decoding. Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf. Preprint. 2024. [paper]

If you use any source code from this repository in your work, please cite the following paper. The BibTeX entry is listed below:

@misc{lu2024encode,
      title={Encode Once and Decode in Parallel: Efficient Transformer Decoding}, 
      author={Bo-Ru Lu and Nikita Haduong and Chien-Yu Lin and Hao Cheng and Noah A. Smith and Mari Ostendorf},
      year={2024},
      eprint={2403.13112},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}