Data Augmentation for Low-Resource Neural Machine Translation
kweonwooj commented
Abstract
- Propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, synthetically created contexts
- Experimental results in a simulated low-resource setting for En-De/De-En show an improvement of ~3.0 BLEU over back-translation
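The idea in the abstract can be sketched roughly as follows. This is a minimal toy illustration and not the authors' implementation: `augment_pair`, `toy_lm`, the `lexicon` dictionary, and the example sentences are all hypothetical names and data introduced here. It assumes a language model proposes substitutes for a source position, a word alignment maps source positions to target positions, and a simple dictionary lookup stands in for choosing the translation of the rare word.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical alignment format: source token index -> target token index.
Alignment = Dict[int, int]
SentencePair = Tuple[List[str], List[str]]


def augment_pair(
    src: List[str],
    tgt: List[str],
    alignment: Alignment,
    rare_words: Set[str],
    lm_candidates: Callable[[List[str], int], List[str]],
    lexicon: Dict[str, str],
) -> List[SentencePair]:
    """Create new sentence pairs by placing rare words in existing contexts."""
    augmented: List[SentencePair] = []
    for i in range(len(src)):
        j = alignment.get(i)
        if j is None:
            # No aligned target word, so the target side cannot be updated.
            continue
        for cand in lm_candidates(src, i):
            # Keep only substitutes that are rare, differ from the original
            # word, and have a known translation (assumption: dict lookup).
            if cand not in rare_words or cand == src[i] or cand not in lexicon:
                continue
            new_src = src[:i] + [cand] + src[i + 1:]
            new_tgt = tgt[:j] + [lexicon[cand]] + tgt[j + 1:]
            augmented.append((new_src, new_tgt))
    return augmented


def toy_lm(tokens: List[str], position: int) -> List[str]:
    """Stand-in for a real LM: proposes fixed nouns after the word 'the'."""
    if position > 0 and tokens[position - 1] == "the":
        return ["cat", "badger"]
    return []


pairs = augment_pair(
    src="i saw the dog".split(),
    tgt="ich sah den hund".split(),
    alignment={0: 0, 1: 1, 2: 2, 3: 3},
    rare_words={"badger"},
    lexicon={"badger": "dachs"},
    lm_candidates=toy_lm,
)
print(pairs)
```

Here "cat" is skipped because it is not in the rare-word set, so only the pair with "badger" / "dachs" is produced. A real setup would replace `toy_lm` with a trained language model and `lexicon` with a translation-selection step based on the aligner's output.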
Details
Translation Data Augmentation
Result
Personal Thoughts
- You need a language model (LM) and a word aligner to augment the data
- Augmentation focuses on rare words only; it adds no diversity at the sentence or semantic level
- Not a good NLP augmentation method...
Link : https://arxiv.org/pdf/1705.00440.pdf
Authors : Fadaee et al. 2017