Question about useFast=False when loading PhoBERT Tokenizer

Question

ithieund opened this issue 2 years ago · 2 comments

Hi @datquocnguyen,
I remember you mentioned in some document that when using transformers v4, we should add useFast=False when loading the tokenizer.
Is that still true? I couldn't find that document again, and your latest README seems to have been updated as well.

Another question: what is the difference between useFast = True and useFast = False in this case? Does anything change in the output?
Thank you very much.

Answer 1 · 2022-11-26T10:31:34.000Z

"useFast = True" and "useFast = False" produce the same output tokenization.
"useFast = True" (the default) lets you run the latest examples in https://github.com/huggingface/transformers/tree/main/examples/pytorch and similar scripts with the fast PhoBERT tokenizer (installation is shown in the README), while "useFast = False" is meant for the examples in https://github.com/huggingface/transformers/tree/main/examples/legacy
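
To check the "same output" claim on your own sentences, here is a minimal sketch. It assumes the `vinai/phobert-base` checkpoint name (not stated in this thread) and uses the keyword as it is spelled in `AutoTokenizer.from_pretrained`, which is `use_fast`. The helper names are hypothetical. If the fast PhoBERT tokenizer from the README is not installed, transformers may hand back the slow tokenizer for both calls, so the comparison would pass trivially.

```python
def load_phobert_tokenizers(model_name="vinai/phobert-base"):
    """Load the slow and fast tokenizers (model_name is an assumed checkpoint)."""
    # Imported here so the comparison helper below works without transformers.
    from transformers import AutoTokenizer

    slow = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    fast = AutoTokenizer.from_pretrained(model_name, use_fast=True)
    return slow, fast


def find_tokenization_mismatches(slow, fast, sentences):
    """Return (sentence, slow_tokens, fast_tokens) for each sentence that differs."""
    mismatches = []
    for sentence in sentences:
        slow_tokens = slow.tokenize(sentence)
        fast_tokens = fast.tokenize(sentence)
        if slow_tokens != fast_tokens:
            mismatches.append((sentence, slow_tokens, fast_tokens))
    return mismatches
```

Calling `find_tokenization_mismatches(*load_phobert_tokenizers(), ["Tôi là sinh_viên ."])` downloads the tokenizer files on first use. An empty list means the two tokenizers agreed on every sentence you passed in. Checking `fast.is_fast` tells you whether a fast tokenizer was actually loaded.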

Answer 2 · 2022-11-26T13:01:27.000Z

Thank you.
