/tokenizer

General tokenizer library for the Web and Node. Supports Huggingface and Tiktoken formats.

Primary language: Rust. License: BSD 2-Clause "Simplified" (BSD-2-Clause).

Tokenizer

Work in progress.

Installing from git requires rustc and cargo to be available and the wasm32-wasi target to be installed.
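The prerequisites above can be checked with a short shell sketch like the one below. It assumes the Rust toolchain is managed with rustup, which this README does not state; `check_tool` is a hypothetical helper introduced only for this example. Note that recent Rust releases renamed the wasm32-wasi target to wasm32-wasip1, so the exact target name may depend on your toolchain version.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: reports whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_tool rustc
check_tool cargo

# Assumption: the toolchain is managed by rustup.
if command -v rustup >/dev/null 2>&1; then
  if rustup target list --installed | grep -q '^wasm32-wasi$'; then
    echo "wasm32-wasi: installed"
  else
    echo "wasm32-wasi: not installed (try: rustup target add wasm32-wasi)"
  fi
fi
```

If the target is reported as not installed, `rustup target add wasm32-wasi` adds it on toolchains that still ship a target under that name.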