This repo contains the code and pretrained models for our paper:
Mask Attention Networks: Rethinking and Strengthen Transformer
The two sub-directories contain reproducible code and instructions for the machine translation and abstractive summarization tasks. Please see the README in each sub-directory for detailed reproduction instructions.