Note: this code is unfinished and the project is currently on hold.
Unsupervised learning for machine translation systems. These approaches are particularly valuable for low-resource language pairs for which no parallel sentences are available, since they require only monolingual corpora.
- FastText (learning monolingual word embeddings)
- MUSE (unsupervised learning of bilingual dictionaries)
- Moses (phrase-based statistical machine translation toolkit)
- KenLM (training smoothed n-gram language models)
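To give a flavor of the MUSE component above: after adversarial training, MUSE refines the cross-lingual mapping with an orthogonal Procrustes step. The sketch below is a toy NumPy illustration of only that refinement step, assuming a set of aligned word pairs is already available (synthetic data here, not the actual MUSE code or data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "source" embeddings (n words, d dims) and a hidden orthogonal map
# that generates the "target" embeddings -- stand-ins for two monolingual
# embedding spaces with a known word-pair dictionary.
n, d = 50, 8
X = rng.standard_normal((n, d))
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # ground-truth rotation
Y = X @ Q

# Orthogonal Procrustes: the orthogonal W minimizing ||X W - Y||_F is
# W = U V^T, where U S V^T is the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

print(np.allclose(X @ W, Y))  # the recovered mapping aligns the two spaces
```

In the real pipeline the aligned pairs come from the adversarially induced dictionary rather than being given, and the refined mapping is then used to translate word by word and seed the phrase table.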
```shell
git clone --recursive git@github.com:tomrunia/UnsupervisedPBSMT.git
```
- Conneau, Alexis, et al. "Word translation without parallel data." ICLR 2018.
- Lample, Guillaume, et al. "Unsupervised Machine Translation Using Monolingual Corpora Only." ICLR 2018.
- Lample, Guillaume, et al. "Phrase-Based & Neural Unsupervised Machine Translation." EMNLP 2018.