C++ code for "Tree-to-Sequence Attentional Neural Machine Translation (tree2seq ANMT)"

Tree2Seq: Tree-to-Sequence Attentional Neural Machine Translation

We have proposed a novel syntactic ANMT model, "Tree-to-Sequence Attentional Neural Machine Translation" [1]. It extends the original sequence-to-sequence model [2] with source-side phrase structure. The model's attention mechanism lets the decoder generate each translated word while softly aligning it with both the phrases and the words of the source sentence. Here is an online demo of Tree2Seq.
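The soft alignment over source phrases and words can be illustrated with a small standalone sketch. It is not taken from this repository: the names `Vec`, `dot`, `attentionWeights` and `contextVector` are hypothetical, plain `std::vector` is used in place of Eigen, and a dot-product score followed by a softmax is assumed as the scoring function. The only point is that word (leaf) states and phrase (internal node) states are placed in one list and attended over jointly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration; the repository itself uses Eigen.
using Vec = std::vector<double>;

// Dot product of two equally sized vectors.
double dot(const Vec& a, const Vec& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
  return sum;
}

// Attention weights of one decoder state over all source nodes.
// `nodes` is assumed to hold the word (leaf) states followed by the
// phrase (internal node) states, so both kinds share one softmax.
std::vector<double> attentionWeights(const Vec& decoderState,
                                     const std::vector<Vec>& nodes) {
  assert(!nodes.empty());
  std::vector<double> weights(nodes.size());
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    weights[i] = dot(nodes[i], decoderState);  // assumed dot-product score
  }
  // Subtract the maximum score before exponentiating for numerical stability.
  const double maxScore = *std::max_element(weights.begin(), weights.end());
  double total = 0.0;
  for (double& w : weights) {
    w = std::exp(w - maxScore);
    total += w;
  }
  for (double& w : weights) w /= total;
  return weights;
}

// Context vector: attention-weighted sum of the word and phrase states.
Vec contextVector(const std::vector<double>& weights,
                  const std::vector<Vec>& nodes) {
  assert(!nodes.empty() && weights.size() == nodes.size());
  Vec context(nodes.front().size(), 0.0);
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    for (std::size_t k = 0; k < context.size(); ++k) {
      context[k] += weights[i] * nodes[i][k];
    }
  }
  return context;
}
```

In this sketch, calling `attentionWeights` with the current decoder state and the combined list of word and phrase states returns one weight per source node, and `contextVector` turns those weights into the vector the decoder would consume when predicting the next target word.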

Description

C++ code for the syntactic Attention-based Neural Machine Translation (ANMT) model.

  1. AttentionTreeEncDec.xpp: our ANMT model, "Tree-to-Sequence Attentional Neural Machine Translation"
  2. AttentionEncDec.xpp: Baseline ANMT model [3]
  3. /data/: Tanaka Corpus (EN-JP) [4]

Requirement

Usage

  1. Modify the paths of EIGEN_LOCATION, SHARE_LOCATION and BOOST_LOCATION. See Makefile.
  2. $ bash setup.sh
  3. $ ./tree2seq (Training of the AttentionTreeEncDec model then starts.)
  4. Modify main.cpp if you want to change the model.

(!) Attention: Only a small sample of the Tanaka Corpus is included here. For actual training you need a parallel corpus of more than 100,000 sentence pairs.

Citation

Contact

Thank you for your interest. If you have any questions or comments, feel free to contact us.

  • eriguchi [.at.] logos.t.u-tokyo.ac.jp
  • hassy [.at.] logos.t.u-tokyo.ac.jp