syncdoth/RetNet

Tokenizer Choice?

risedangel opened this issue · 1 comment

Hello,
Would it be possible to use tiktoken as the tokenizer?
(see: https://pypi.org/project/tiktoken/)

Sure. This code is not paired with pretrained weights (yet), so you can choose whichever tokenizer you want. On the model side, the only thing you need to take care of is setting pad_token_id, eos_token_id, and vocab_size in the RetNetConfig so that they match the tokenizer.
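
As a rough sketch of what that could look like with tiktoken: the helper below reads the vocabulary size and end-of-text id from a tiktoken encoding and returns them as keyword arguments. The helper name `config_kwargs_from_tiktoken` is made up for illustration, and passing the result to RetNetConfig as keyword arguments is an assumption about how the config is constructed. tiktoken encodings do not define a dedicated padding token, so the sketch falls back to reusing the end-of-text id for padding unless you pass one explicitly.

```python
def config_kwargs_from_tiktoken(enc, pad_token_id=None):
    """Collect the tokenizer-dependent config fields from a tiktoken encoding.

    `enc` is expected to be a tiktoken.Encoding; only its `n_vocab` and
    `eot_token` properties are used. This helper is illustrative and is
    not part of tiktoken or of this repo.
    """
    vocab_size = enc.n_vocab      # largest token id + 1
    eos_token_id = enc.eot_token  # id of the "<|endoftext|>" special token

    # Assumption: with no dedicated padding token available, reuse the
    # end-of-text id for padding unless the caller supplies another id.
    if pad_token_id is None:
        pad_token_id = eos_token_id

    # Every special id must index into the embedding table.
    for name, token_id in (("eos_token_id", eos_token_id),
                           ("pad_token_id", pad_token_id)):
        if not 0 <= token_id < vocab_size:
            raise ValueError(f"{name}={token_id} is outside [0, {vocab_size})")

    return {
        "vocab_size": vocab_size,
        "eos_token_id": eos_token_id,
        "pad_token_id": pad_token_id,
    }


# Example usage (requires `pip install tiktoken`; the encoding files are
# downloaded on first use):
#
#   import tiktoken
#   enc = tiktoken.get_encoding("gpt2")
#   kwargs = config_kwargs_from_tiktoken(enc)
#   config = RetNetConfig(**kwargs)  # assumed constructor usage
```

If you reuse the end-of-text id for padding, make sure padded positions are masked out of the loss, otherwise the model will also be trained to predict padding.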