A Chainer implementation of the Transformer for a chatbot.
This implementation takes the diversity of the model's output into account.
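The README doesn't spell out how diversity is encouraged, but the reference below to "Another Diversity-Promoting Objective Function for Neural Dialogue Generation" points at an inverse-token-frequency (ITF) weighted loss. Here is a minimal sketch of such a loss in Chainer, assuming ITF weighting; the function name and the `power` hyperparameter are illustrative, not this repository's API:

```python
import numpy as np
import chainer.functions as F

def itf_loss(logits, targets, freq_table, power=0.4):
    """Cross entropy weighted by inverse token frequency (a sketch).

    logits:     (batch, vocab) float32
    targets:    (batch,) int32 ndarray of gold token ids
    freq_table: (vocab,) float32 ndarray of corpus token frequencies
    """
    # Per-example cross entropy, kept unreduced so it can be reweighted.
    ce = F.softmax_cross_entropy(logits, targets, reduce='no')
    # Rarer tokens get larger weights: w_t = 1 / freq(t)^power.
    weights = 1.0 / np.power(np.maximum(freq_table[targets], 1.0), power)
    return F.mean(ce * weights.astype(np.float32))
```

The token counts in `./dataset/piece_frequency.txt` (described below) are exactly the statistic such a weighting needs.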
## Files

- `README.md`
- `transformer.py`
- `tokenizer.py`
- `utils.py`
- `layers`
    - `__init__.py`
    - `decoder.py`
    - `encoder.py`
    - `feed_forward_layer.py`
    - `layer_normalization_3d.py`
    - `multi_head_attention.py`
- `dataset`
- training code
- folder "./dataset"
- ./dataset/trained_model.model
- sentencepiece tokenizer's model file
- ./dataset/trained_model.vocab
- sentencepiece tokenizer's vocab file
- ./dataset/piece_frequency.txt
- token frequency file
- one line contains the frequency of the token in the same line with .vocab file
- if the token "hello" appears 300 times, "300" is in the "hello line".
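A minimal sketch of how these three files line up, using the standard SentencePiece Python API (variable names are illustrative):

```python
import sentencepiece as spm

# Load the trained tokenizer model shipped in ./dataset.
sp = spm.SentencePieceProcessor()
sp.Load("./dataset/trained_model.model")

# Line i of piece_frequency.txt holds the corpus frequency of the piece
# on line i of trained_model.vocab (each vocab line is "piece<TAB>score").
with open("./dataset/trained_model.vocab", encoding="utf-8") as f:
    pieces = [line.rstrip("\n").split("\t")[0] for line in f]
with open("./dataset/piece_frequency.txt", encoding="utf-8") as f:
    freqs = [int(line.strip()) for line in f]

piece_freq = dict(zip(pieces, freqs))  # maps each piece to its count
```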
If you set the `augmentation` argument of the `TransformerConfig` class to `True`, some extra preprocessing is required, such as calculating the expected value of each token's frequency, presumably because subword regularization re-segments every sentence at training time, so counts taken from a single segmentation no longer match what the model actually sees. A sketch of one way to do this follows.
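One plausible reading of "calculate the expected value" is to average piece counts over segmentations sampled with subword regularization; the helper below is a hypothetical illustration of that, not code from this repository:

```python
import collections
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("./dataset/trained_model.model")

def expected_frequencies(sentences, n_samples=100, alpha=0.1):
    """Monte-Carlo estimate of each piece's expected corpus frequency
    under subword-regularization sampling (hypothetical helper)."""
    counts = collections.Counter()
    for sentence in sentences:
        for _ in range(n_samples):
            # Sample one segmentation from the full lattice (nbest_size=-1).
            counts.update(sp.SampleEncodeAsPieces(sentence,
                                                  nbest_size=-1,
                                                  alpha=alpha))
    # Averaging over the samples turns totals into expected counts.
    return {piece: c / n_samples for piece, c in counts.items()}
```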
## Requirements

- anaconda3-5.2.0
- Chainer 5.2.0
- NumPy 1.15.1
- SentencePiece 0.1.82
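No install procedure is documented; assuming a pip-based environment on top of anaconda3-5.2.0, the pinned versions could be installed with:

```
pip install chainer==5.2.0 numpy==1.15.1 sentencepiece==0.1.82
```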
## References

- Attention Is All You Need
- SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
- Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
- Another Diversity-Promoting Objective Function for Neural Dialogue Generation
- On Layer Normalization in the Transformer Architecture