Help understanding the BPE logic
BogdanDidenko opened this issue · 2 comments
BogdanDidenko commented
Hello. Sorry, but I can't understand how this function works. In my tests, in most cases the result is equal to the original token parameter value.
https://github.com/openai/finetune-transformer-lm/blob/master/text_utils.py#L49
hohoCode commented
Same question.
thomwolf commented
Hi, all of this BPE logic is taken from Sennrich's work.
For more information you should:
- read Sennrich's paper on BPE: http://www.aclweb.org/anthology/P16-1162
- and/or check out his code: https://github.com/rsennrich/subword-nmt
Related and a bit more recent: https://github.com/google/sentencepiece
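
For a rough intuition, here is a minimal sketch of how a BPE merge loop of this kind typically works. It is not the exact code from `text_utils.py`: the function name `apply_bpe` and the tiny merge table `toy_merges` are made up for illustration, and a real model loads tens of thousands of learned merges from a file. The idea is to split the token into characters, mark the end of the word with `</w>`, and then repeatedly merge the adjacent pair with the best (lowest) rank until no known pair is left.

```python
def get_pairs(word):
    """Return the set of adjacent symbol pairs in a word (a tuple of symbols)."""
    return {(a, b) for a, b in zip(word, word[1:])}


def apply_bpe(token, bpe_ranks):
    """Greedily merge the best-ranked adjacent pair until no known pair remains.

    `bpe_ranks` maps a symbol pair to its merge priority (lower = merged earlier).
    Assumes `token` is a non-empty string.
    """
    # Start from single characters; the last one carries the end-of-word marker.
    word = tuple(token[:-1]) + (token[-1] + "</w>",)
    while len(word) > 1:
        pairs = get_pairs(word)
        # Pick the pair that was learned earliest; unknown pairs rank as infinity.
        best = min(pairs, key=lambda p: bpe_ranks.get(p, float("inf")))
        if best not in bpe_ranks:
            break  # nothing left that the merge table knows about
        first, second = best
        merged = []
        i = 0
        while i < len(word):
            # Replace every occurrence of (first, second) with the merged symbol.
            if i < len(word) - 1 and word[i] == first and word[i + 1] == second:
                merged.append(first + second)
                i += 2
            else:
                merged.append(word[i])
                i += 1
        word = tuple(merged)
    return " ".join(word)


# Hypothetical toy merge table, ordered from highest to lowest priority.
toy_merges = [("l", "o"), ("lo", "w</w>"), ("e", "r</w>"), ("lo", "w")]
toy_ranks = {pair: rank for rank, pair in enumerate(toy_merges)}

print(apply_bpe("low", toy_ranks))    # low</w>
print(apply_bpe("lower", toy_ranks))  # low er</w>
print(apply_bpe("xyz", toy_ranks))    # x y z</w>
```

In this sketch, `low` is fully reassembled into a single symbol because every merge it needs is in the table. That is probably why the result usually looks the same as the input token in your tests: with a large learned merge table, frequent words get merged all the way back into one piece, and only rarer words stay split into several subword units.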