Unsupervised Phrase-Based Statistical Machine Translation

This code is unfinished and development is currently on hold.

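This repository explores unsupervised learning for machine translation systems, meaning that the translation model is trained from monolingual corpora only. Such approaches are particularly valuable for low-resource languages for which no parallel sentences are available.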

Third-Party Software

  • FastText (learning word embeddings)
  • MUSE (unsupervised learning of a bilingual dictionary)
  • Moses (phrase-based statistical machine translation system)
  • KenLM (learning smoothed n-gram language models)
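
To make the role of the bilingual dictionary more concrete, the following is a minimal sketch of dictionary induction by nearest-neighbour search, in the spirit of Conneau et al. (see References). It is not the code in this repository and not the MUSE API. It assumes the source and target word embeddings have already been mapped into a shared space, and it uses plain cosine similarity, whereas MUSE learns the mapping itself and retrieves translations with the CSLS criterion. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def induce_dictionary(src_embeddings, tgt_embeddings):
    """Map each source word to its most similar target word.

    Both arguments are dicts of word -> vector and are ASSUMED to live
    in the same (already aligned) embedding space.
    """
    dictionary = {}
    for src_word, src_vec in src_embeddings.items():
        dictionary[src_word] = max(
            tgt_embeddings,
            key=lambda tgt_word: cosine(src_vec, tgt_embeddings[tgt_word]),
        )
    return dictionary


# Hypothetical toy embeddings (English -> French), for illustration only.
src_embeddings = {"cat": [1.0, 0.1], "dog": [0.1, 1.0]}
tgt_embeddings = {"chien": [0.2, 0.9], "chat": [0.9, 0.2]}

dictionary = induce_dictionary(src_embeddings, tgt_embeddings)
print(dictionary)
```

In a phrase-based pipeline, word pairs induced in this way could serve to seed an initial phrase table, with the n-gram language model then scoring the fluency of candidate translations.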

Installation

git clone --recursive git@github.com:tomrunia/UnsupervisedPBSMT.git

References

  • Conneau, Alexis, et al. "Word translation without parallel data." ICLR 2018.
  • Lample, Guillaume, et al. "Unsupervised Machine Translation Using Monolingual Corpora Only." ICLR 2018.
  • Lample, Guillaume, et al. "Phrase-Based & Neural Unsupervised Machine Translation." EMNLP 2018.