guillaume-be/rust-tokenizers
Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigram (SentencePiece) models.
Rust · Apache-2.0
Stargazers
- AlexMikhalev (@applied-knowledge-systems)
- aloosley (GAIA Technologies GmbH)
- connorskees (Figma)
- D1mon
- daaku
- dehoyosb (Blue Orange Digital)
- digsy89 (Yogiyo)
- diversable (BC, Canada)
- dlukes (Agnostix, s.r.o.)
- dtsbourg (New York City, NY)
- entn-at (Portland, Oregon)
- floschnell (Data:Lab)
- Fohlen (Tübingen)
- hscspring
- jbowles (@ailgroup)
- josrzn (Grenoble, France)
- Kerollmops (@meilisearch)
- loretoparisi (@Musixmatchdev)
- Mansterteddy (Microsoft)
- MarinPostma (turso)
- matiu2
- messense (Hong Kong)
- mladvladimir (Belgrade, Serbia)
- nklein23 (@pachama)
- o1iv3r (Munich)
- pandaplusplus
- pbarker (Boulder, CO)
- piaoger
- proycon (KNAW Humanities Cluster & CLST, Radboud University)
- sangjeedondrub (Amdo)
- sarthakTUM
- simonefrancia (@musixmatch, @Musixmatchdev)
- stanislav-tkach
- tmabraham
- tuxcanfly (@bcoin-org, @handshake-org)
- vinhvd749