Tokenizer Choice?
risedangel opened this issue · 1 comment
risedangel commented
Hello,
Would it be possible to use tiktoken as the tokenizer?
(see: https://pypi.org/project/tiktoken/)
syncdoth commented
Sure. This code is not paired with pretrained weights (yet), so you can choose whichever tokenizer you want. The only thing to take care of on the model side is setting the correct pad_token_id, eos_token_id, and vocab_size in the RetNetConfig.
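As a rough sketch of what that could look like with tiktoken: the helper names below (`retnet_config_kwargs`, `kwargs_from_tiktoken`) are made up for illustration, and reusing the end-of-text token as padding is an assumption, since tiktoken encodings do not define a pad token. The `n_vocab` and `eot_token` attributes are tiktoken's own; passing the result as keyword arguments to RetNetConfig is assumed from the field names above.

```python
def retnet_config_kwargs(n_vocab, eos_id, pad_id=None):
    """Collect the three tokenizer-dependent config fields (hypothetical helper)."""
    if pad_id is None:
        # Assumption: no dedicated pad token exists, so reuse EOS for padding.
        pad_id = eos_id
    for name, tok in (("eos_token_id", eos_id), ("pad_token_id", pad_id)):
        # Every special token id must be a valid index into the embedding table.
        if not 0 <= tok < n_vocab:
            raise ValueError(f"{name}={tok} is outside vocab of size {n_vocab}")
    return {"vocab_size": n_vocab, "eos_token_id": eos_id, "pad_token_id": pad_id}


def kwargs_from_tiktoken(encoding_name="gpt2"):
    """Read the values from a tiktoken encoding (hypothetical helper)."""
    import tiktoken  # imported lazily; requires `pip install tiktoken`

    enc = tiktoken.get_encoding(encoding_name)
    return retnet_config_kwargs(enc.n_vocab, enc.eot_token)


# Values tiktoken reports for its "gpt2" encoding, written out by hand here.
kwargs = retnet_config_kwargs(n_vocab=50257, eos_id=50256)
print(kwargs)

# Assumed usage, not checked against the repo:
# config = RetNetConfig(**kwargs)
```

If padded positions are masked out of the loss and attention, sharing one id for padding and EOS is usually harmless; otherwise it may be safer to reserve an extra id and increase vocab_size by one.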