/minbpe.c

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, written in pure C.
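
For orientation, one BPE training step can be sketched as follows: count adjacent token pairs, pick the most frequent one, and replace each occurrence with a new token id. This is a minimal sketch and not the repository's code. The function names `most_frequent_pair` and `merge_pair` are hypothetical, and the quadratic pair scan is chosen for brevity rather than speed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: find the most frequent adjacent pair in ids[0..n).
 * Overlapping occurrences are counted (e.g. "aaa" contains (a,a) twice).
 * Ties go to the pair that appears first. Returns the count and writes
 * the pair to *a and *b; returns 0 and leaves them untouched if n < 2. */
size_t most_frequent_pair(const int *ids, size_t n, int *a, int *b) {
    assert(ids != NULL && a != NULL && b != NULL);
    size_t best = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        size_t count = 0;
        /* O(n^2) scan for clarity; a hash table would be used in practice. */
        for (size_t j = 0; j + 1 < n; j++) {
            if (ids[j] == ids[i] && ids[j + 1] == ids[i + 1]) {
                count++;
            }
        }
        if (count > best) {
            best = count;
            *a = ids[i];
            *b = ids[i + 1];
        }
    }
    return best;
}

/* Hypothetical helper: replace each non-overlapping occurrence of (a, b),
 * scanning left to right, with new_id. Works in place and returns the
 * new length of the sequence. */
size_t merge_pair(int *ids, size_t n, int a, int b, int new_id) {
    assert(ids != NULL);
    size_t out = 0;
    size_t i = 0;
    while (i < n) {
        if (i + 1 < n && ids[i] == a && ids[i + 1] == b) {
            ids[out++] = new_id; /* consume both members of the pair */
            i += 2;
        } else {
            ids[out++] = ids[i++];
        }
    }
    return out;
}
```

Starting from raw bytes (ids 0 to 255), a trainer would call these two helpers in a loop, assigning new ids 256, 257, and so on, until the target vocabulary size is reached.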

Primary Language: C
