Support for GPT-3.5 Tokenizer
iRambax opened this issue · 0 comments
iRambax commented
Hello, thank you for creating the ChatGPTSwift library. I noticed that the tokenizer currently used is the BPE tokenizer for ChatGPT-3, which is different from the Unigram language model tokenizer used by GPT-3.5.
Since we need to count the tokens used manually in streaming mode, I was wondering whether there are plans to implement the GPT-3.5 tokenizer in the ChatGPTSwift library.
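To illustrate the use case, here is a minimal sketch of counting tokens over a streamed response. The `Tokenizer` protocol and `WhitespaceTokenizer` below are placeholders of my own, not ChatGPTSwift's actual API; a real implementation would plug in the appropriate GPT-3.5 encoding instead:

```swift
// Hypothetical tokenizer interface; ChatGPTSwift's real API may differ.
protocol Tokenizer {
    /// Encodes text into token IDs.
    func encode(_ text: String) -> [Int]
}

/// Toy tokenizer that splits on whitespace, for illustration only.
/// A real GPT-3.5 tokenizer would use the model's BPE merges instead.
struct WhitespaceTokenizer: Tokenizer {
    func encode(_ text: String) -> [Int] {
        text.split(whereSeparator: \.isWhitespace).map { $0.hashValue }
    }
}

/// Accumulates streamed chunks and reports the running token count.
struct StreamTokenCounter {
    let tokenizer: Tokenizer
    private(set) var text = ""

    /// Re-encodes the full accumulated text after each chunk, because a
    /// chunk boundary can split a token and shift earlier token boundaries.
    mutating func append(_ chunk: String) -> Int {
        text += chunk
        return tokenizer.encode(text).count
    }
}
```

Re-encoding the whole buffer on each chunk is deliberate: counting each chunk independently can over- or under-count when a token straddles two chunks.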
Thank you for your consideration.