Special Tokens not Provided
mediadepp opened this issue · 2 comments
mediadepp commented
Hi, I think the following code in the notebook provided on GitHub has a problem.
tokenizer.train(files=paths, vocab_size=52_000, min_frequency=2, special_tokens=[
    "<s>",
    "<pad>",
    "</s>",
    "<unk>",
    "<mask>",
])
The special tokens are not provided. You can find this code in Chapter 4, Section 3.
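For context, here is a rough sketch of what the full cell would look like, assuming the ByteLevelBPETokenizer from the Hugging Face tokenizers library; the glob pattern and output directory below are placeholders, not the notebook's actual values:

# A minimal sketch, assuming ByteLevelBPETokenizer from the tokenizers
# library; the paths and output directory are hypothetical placeholders.
import os
from pathlib import Path
from tokenizers import ByteLevelBPETokenizer

# Gather the training text files (placeholder glob pattern).
paths = [str(p) for p in Path(".").glob("**/*.txt")]

tokenizer = ByteLevelBPETokenizer()

# Train the byte-level BPE tokenizer, passing the special tokens explicitly.
tokenizer.train(files=paths, vocab_size=52_000, min_frequency=2, special_tokens=[
    "<s>",
    "<pad>",
    "</s>",
    "<unk>",
    "<mask>",
])

# Save vocab.json and merges.txt to a placeholder directory.
os.makedirs("./tokenizer_out", exist_ok=True)
tokenizer.save_model("./tokenizer_out")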
lxt3 commented
I believe this was solved in the April 2023 update of the notebook by running pip install --upgrade accelerate and installing the latest transformers module.
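In a notebook, that amounts to cells along the lines of the following (a sketch based on the description above; the updated notebook's exact cells may differ):

# Upgrade the dependencies named above before running the training cells.
!pip install --upgrade accelerate
!pip install --upgrade transformers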
Denis2054 commented
Yes, thank you. Hugging Face now requires the accelerate library, and the notebook was updated in April 2023. Thank you for explaining this.