dotnet/ai-samples

Update Phi tokenizer implementation to use Microsoft.ML.Tokenizers

luisquintanilla opened this issue · 1 comment

Work on the CodeGen tokenizer needed for Phi-2 will be complete in Microsoft.ML.Tokenizers in the next few days.

Once it's available, consider replacing the current tokenizer implementation with the one provided by Microsoft.ML.Tokenizers.

https://github.com/dotnet/ai-samples/blob/main/src/local-models/Phi/Tokenizer.cs

For the Phi-3 sample, use the existing LlamaTokenizer in Microsoft.ML.Tokenizers.
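
A rough sketch of what the swap might look like, assuming the CodeGen tokenizer is exposed as `CodeGenTokenizer.Create(vocabStream, mergesStream)` and that the installed Microsoft.ML.Tokenizers version offers `EncodeToIds` and `Decode`. The file paths are hypothetical placeholders for the vocabulary, merges, and SentencePiece model files that ship with each model. This is not the samples' actual code.

```csharp
using System;
using System.IO;
using Microsoft.ML.Tokenizers;

// Hypothetical local paths; point these at the files downloaded with each model.
using Stream vocabStream = File.OpenRead("phi-2/vocab.json");
using Stream mergesStream = File.OpenRead("phi-2/merges.txt");
using Stream modelStream = File.OpenRead("phi-3/tokenizer.model");

// Phi-2: CodeGen (BPE) tokenizer built from the vocab and merges files.
// Assumption: the Create overload taking two streams is available in the installed package version.
Tokenizer phi2Tokenizer = CodeGenTokenizer.Create(vocabStream, mergesStream);

// Phi-3: the existing Llama (SentencePiece) tokenizer built from tokenizer.model.
Tokenizer phi3Tokenizer = LlamaTokenizer.Create(modelStream);

foreach (Tokenizer tokenizer in new[] { phi2Tokenizer, phi3Tokenizer })
{
    // Encode a prompt to token ids, then decode it back as a quick sanity check.
    var ids = tokenizer.EncodeToIds("Hello, world!");
    var roundTrip = tokenizer.Decode(ids);
    Console.WriteLine($"{tokenizer.GetType().Name}: {ids.Count} tokens -> {roundTrip}");
}
```

If this shape holds, the custom Tokenizer.cs in the Phi sample could be reduced to a thin wrapper around these calls or removed entirely.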