/kitoken

Fast and versatile tokenizer for language models with BPE, Unigram and WordPiece tokenization. Compatible with SentencePiece, Tokenizers, Tiktoken and more.

Primary language: Rust
License: BSD 2-Clause "Simplified" License (BSD-2-Clause)
