Denis2054/Transformers-for-NLP-2nd-Edition

Special Tokens not Provided

mediadepp opened this issue · 2 comments

Hi, I think the following code in the notebook provided on GitHub has a problem:

tokenizer.train(files=paths, vocab_size=52_000, min_frequency=2, special_tokens=[
    "<s>",
    "<pad>",
    "</s>",
    "<unk>",
    "<mask>",
])

The special tokens are not provided to the tokenizer. You can find this code in Chapter 4, Section 3.
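For context, here is a minimal, self-contained sketch of how this call is typically set up with the Hugging Face tokenizers library's ByteLevelBPETokenizer; the corpus file name below is hypothetical, and in the notebook the paths list is built from its own downloaded text:

from tokenizers import ByteLevelBPETokenizer

# Hypothetical corpus list; the notebook points `paths` at its own text file(s).
paths = ["kant.txt"]

tokenizer = ByteLevelBPETokenizer()

# Registering the special tokens at training time gives them fixed, low IDs
# in the resulting vocabulary. Note the closing parenthesis, which the
# snippet quoted above is missing.
tokenizer.train(files=paths, vocab_size=52_000, min_frequency=2, special_tokens=[
    "<s>",
    "<pad>",
    "</s>",
    "<unk>",
    "<mask>",
])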

lxt3 commented

I believe this was solved in the April 2023 update of the notebook, by pip installing --upgrade accelerate and installing the latest transformers module.
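For reference, the fix would amount to install cells at the top of the notebook along these lines (a sketch; the exact cells and version pins in the April 2023 update may differ):

# Notebook install cells reflecting the fix described above (sketch).
!pip install --upgrade accelerate
!pip install --upgrade transformers

Recent versions of the transformers Trainer depend on the accelerate package, so upgrading both together avoids the related import error.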

Yes, thank you. Hugging Face now requires the accelerate library, and the notebook was updated in April 2023. Thank you for explaining this.