Fast BPE algorithm to generate byte pair encodings from text corpus, it's written in rust and approximately 20x faster than it's python implementation
Rust
Fast Byte Pair Encoding
This contains the bpe_train which trains on the text corpus to generate byte pairs. It's approximately 20x faster than the version written in python!!.