/bpetokenizer

A Python package for training your own tokenizer with the byte pair encoding (BPE) algorithm for LLMs. Supports regex split patterns and special tokens.
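
To make the description concrete, here is a minimal sketch of byte-level BPE training with a regex pre-split and special tokens. It is not this package's API: `SPLIT_PATTERN`, `train`, `encode`, `decode`, and the helper names are all hypothetical, and the split pattern is a simplified stand-in assumed for illustration. Special-token ids are assumed to lie outside the range used by learned merges.

```python
import re
from collections import Counter

# Simplified split pattern (an assumption, not the package's pattern):
# letter runs, digit runs, other symbol runs, and whitespace runs.
SPLIT_PATTERN = re.compile(r" ?[A-Za-z]+| ?\d+| ?[^\sA-Za-z\d]+|\s+")


def merge(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, vocab_size):
    """Learn up to `vocab_size - 256` merges; pairs never cross chunk boundaries."""
    chunks = [list(c.encode("utf-8")) for c in SPLIT_PATTERN.findall(text)]
    merges = {}
    for new_id in range(256, vocab_size):
        counts = Counter()
        for ids in chunks:
            counts.update(zip(ids, ids[1:]))
        if not counts:
            break  # nothing left to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        merges[pair] = new_id
        chunks = [merge(ids, pair, new_id) for ids in chunks]
    return merges


def encode_ordinary(text, merges):
    """Encode text that contains no special tokens."""
    result = []
    for chunk in SPLIT_PATTERN.findall(text):
        ids = list(chunk.encode("utf-8"))
        while len(ids) >= 2:
            # Apply the earliest-learned merge that is present in this chunk.
            pair = min(set(zip(ids, ids[1:])),
                       key=lambda p: merges.get(p, float("inf")))
            if pair not in merges:
                break
            ids = merge(ids, pair, merges[pair])
        result.extend(ids)
    return result


def encode(text, merges, special_tokens):
    """Encode text, mapping each special token string directly to its id."""
    if not special_tokens:
        return encode_ordinary(text, merges)
    pattern = "(" + "|".join(re.escape(t) for t in special_tokens) + ")"
    ids = []
    for part in re.split(pattern, text):  # capturing group keeps the tokens
        if part in special_tokens:
            ids.append(special_tokens[part])
        else:
            ids.extend(encode_ordinary(part, merges))
    return ids


def decode(ids, merges, special_tokens):
    """Rebuild the byte vocabulary from the merges and decode ids to a string."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():  # dict order matches training order
        vocab[new_id] = vocab[a] + vocab[b]
    for token, i in special_tokens.items():
        vocab[i] = token.encode("utf-8")
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

A possible round trip with this sketch: call `merges = train(corpus, 300)`, pick `special = {"<|endoftext|>": 300}`, and check that `decode(encode(s, merges, special), merges, special) == s` for a sample string `s`.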

Primary Language: Jupyter Notebook