Transformer primitiv
An implementation of "Attention Is All You Need" (Vaswani et al., NIPS'17) in primitiv, without beam search.
It may work.
Result with 2 stacks, 100,000 iterations, and softmax cross-entropy loss on WMT En-De: BLEU = 21.46 pt
Requirements
- Python >= 3.4
- primitiv/primitiv-python v0.3.0
- google/sentencepiece
- tqdm
- numpy
Usage
python main.py preproc [config file]
python main.py train [config file]
python main.py test [config file] > /path/to/output/file
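
The three commands above can also be chained from a small driver script. The following is only a sketch and is not part of this repository: the helper names build_commands and run_pipeline and the example paths are hypothetical, and it assumes main.py is in the current directory and that the test stage writes its output to stdout, as the redirect above suggests.

```python
import subprocess
import sys


def build_commands(config_path):
    """Return the three main.py invocations from the Usage section, in order."""
    return [
        [sys.executable, "main.py", stage, config_path]
        for stage in ("preproc", "train", "test")
    ]


def run_pipeline(config_path, output_path):
    """Run preproc, train, and test in sequence, stopping at the first failure."""
    preproc, train, test = build_commands(config_path)
    # check_call raises CalledProcessError if a stage exits with a non-zero status.
    subprocess.check_call(preproc)
    subprocess.check_call(train)
    # Mirror the shell redirect: send the test stage's stdout to the output file.
    with open(output_path, "w") as out:
        subprocess.check_call(test, stdout=out)


# Example call (both paths are placeholders, not files shipped with this repo):
# run_pipeline("my_config_file", "/path/to/output/file")
```

Using check_call means a failed preprocessing or training run stops the script before the test stage starts.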