VinAIResearch/PhoBERT

Question about use_fast=False when loading the PhoBERT tokenizer

ithieund opened this issue · 2 comments

Hi @datquocnguyen ,
I remember you mentioned in some documentation that with transformers v4 we should pass use_fast=False when loading the tokenizer.
Is that still true? I couldn't find that document again, and your latest README seems to have been updated as well.

Another question: what is the difference between use_fast=True and use_fast=False in this case? Does anything change in the output?
Thank you very much.

"use_fast=True" and "use_fast=False" produce the same output tokenization.
"use_fast=True" (the default) lets you run the latest examples in https://github.com/huggingface/transformers/tree/main/examples/pytorch and similar scripts with the fast PhoBERT tokenizer (installation shown in the README), while "use_fast=False" is for the examples available in https://github.com/huggingface/transformers/tree/main/examples/legacy
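
For anyone who wants to check this locally, here is a minimal sketch that loads both variants and compares their output. It is not taken from the README: the checkpoint name `vinai/phobert-base`, the sample sentence, and the helper names `load_phobert_tokenizers`, `same_tokenization` and `demo` are assumptions made for illustration. If the fast PhoBERT tokenizer is not installed, `AutoTokenizer` may fall back to the slow one, in which case the comparison is trivially true.

```python
def load_phobert_tokenizers(model_name="vinai/phobert-base"):
    """Load the slow and fast tokenizers for one checkpoint.

    model_name is an assumed checkpoint id; replace it with the one you use.
    """
    # Imported lazily so the comparison helper below works without transformers.
    from transformers import AutoTokenizer

    slow = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    fast = AutoTokenizer.from_pretrained(model_name, use_fast=True)
    return slow, fast


def same_tokenization(slow, fast, text):
    """Return True if both tokenizers give the same tokens and input ids."""
    same_tokens = slow.tokenize(text) == fast.tokenize(text)
    same_ids = slow(text)["input_ids"] == fast(text)["input_ids"]
    return same_tokens and same_ids


def demo():
    """Hypothetical usage; downloads the checkpoint on first run."""
    slow, fast = load_phobert_tokenizers()
    # PhoBERT expects word-segmented input (syllables joined by underscores).
    text = "Chúng_tôi là những nghiên_cứu_viên ."
    # Shows which tokenizer classes were actually loaded.
    print(type(slow).__name__, type(fast).__name__)
    print("Same tokenization:", same_tokenization(slow, fast, text))
```

Calling `demo()` prints the two tokenizer class names, which is a quick way to see whether the fast tokenizer was really picked up, followed by the comparison result.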