microsoft/Tokenizer

Example of how to pre-download the BPE rank file

KyleMit opened this issue · 2 comments

The ReadMe states:

In production setting, you should pre-download the BPE rank file and call TokenizerBuilder.CreateTokenizer API to avoid downloading the BPE rank file on the fly.

Can there be additional documentation / examples as to how one might go about doing that?

Yes. The download locations for the BPE rank files are defined in this method:

public static async Task<ITokenizer> CreateByEncoderNameAsync(string encoderName, IReadOnlyDictionary<string, int>? extraSpecialTokens = null)

I have updated readme.md with this information.
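
For anyone landing here, the pre-download step might look roughly like the sketch below. It only covers fetching the rank file once and caching it on disk. The `BpeRankFileCache` class and its method names are hypothetical helpers, not part of the library, and the URL is assumed to be the public location of the `cl100k_base` rank file; check it against the locations defined in `CreateByEncoderNameAsync` before relying on it.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper (not part of the Tokenizer library): caches a BPE rank file on disk.
public static class BpeRankFileCache
{
    // Assumption: public location of the cl100k_base rank file. Verify against the library source.
    public const string Cl100kBaseUrl =
        "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken";

    // Maps a download URL to a file path inside the given cache directory.
    public static string GetLocalPath(string cacheDirectory, string url)
    {
        string fileName = Path.GetFileName(new Uri(url).LocalPath);
        return Path.Combine(cacheDirectory, fileName);
    }

    // Downloads the file only if it is not already cached, and returns the local path.
    public static async Task<string> EnsureDownloadedAsync(
        HttpClient client, string url, string cacheDirectory)
    {
        string localPath = GetLocalPath(cacheDirectory, url);
        if (File.Exists(localPath))
        {
            return localPath;
        }

        Directory.CreateDirectory(cacheDirectory);

        // Write to a temporary file first so an interrupted download never leaves
        // a truncated file at the final path.
        string tempPath = localPath + ".tmp";
        using (Stream remote = await client.GetStreamAsync(url))
        using (FileStream local = File.Create(tempPath))
        {
            await remote.CopyToAsync(local);
        }
        File.Move(tempPath, localPath);
        return localPath;
    }
}
```

Run `EnsureDownloadedAsync` at build or deployment time (or ship the file with the application), then pass the local file to `TokenizerBuilder.CreateTokenizer` together with the special tokens and regex pattern for that encoder. The exact `CreateTokenizer` overloads and parameters are not shown here on purpose; take them from the library source or the updated readme.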