Support for loading a custom vocab.txt?
Closed this issue · 3 comments
BrainSlugs83 commented
I'm interested in being able to tokenize text using a custom vocab.txt
file (à la Hugging Face).
Is this possible with the current tokenizers? If not, is it something you would consider adding?
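For context, here is a rough sketch of what tokenizing against a custom vocab.txt usually involves. It is written in Python purely for illustration and is not this library's API. It assumes the common BERT-style layout (one token per line, with the line index as the token id) and greedy longest-match WordPiece splitting with a `##` continuation prefix. The names `load_vocab`, `wordpiece`, and `tokenize` are hypothetical.

```python
from typing import Dict, Iterable, List


def load_vocab(lines: Iterable[str]) -> Dict[str, int]:
    """Map each token to its line index (assumed BERT-style vocab.txt layout).

    `lines` can be any iterable of strings, e.g. an open text file.
    """
    vocab: Dict[str, int] = {}
    for index, line in enumerate(lines):
        token = line.rstrip("\r\n")
        if token:  # skip blank lines but keep the original line numbering
            vocab[token] = index
    return vocab


def wordpiece(word: str, vocab: Dict[str, int], unk: str = "[UNK]") -> List[str]:
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is found in the vocab.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation of a word
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


def tokenize(text: str, vocab: Dict[str, int], lowercase: bool = True) -> List[str]:
    """Whitespace-split the text, then WordPiece-split each word."""
    if lowercase:
        text = text.lower()
    tokens: List[str] = []
    for word in text.split():
        tokens.extend(wordpiece(word, vocab))
    return tokens


# Toy vocabulary standing in for the contents of a custom vocab.txt.
SAMPLE_LINES = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "token", "##izer", "##s", "custom"]
SAMPLE_VOCAB = load_vocab(SAMPLE_LINES)
```

With a real file, the vocabulary could be loaded with something like `load_vocab(open("vocab.txt", encoding="utf-8"))`. The `lowercase` flag is a stand-in for the cased/uncased distinction. Real tokenizers also handle punctuation splitting and accent stripping, which this sketch leaves out.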
NMZivkovic commented
It is a nice idea.
I will add the classes BertCasedCustom and BertUncasedCustom, which will essentially expose CasedTokenizer and UncasedTokenizer, respectively.
BrainSlugs83 commented
Awesome! 👍🏻
NMZivkovic commented
Two new classes are available in the new version. Check them out and let me know if they work well.
I will close this issue, and we can open a new one if any problems arise.
Thanks for the suggestion once again!