/fast-bpe

Fast BPE algorithm to generate byte pair encodings from text corpus, it's written in rust and approximately 20x faster than it's python implementation

Primary LanguageRust

Fast Byte Pair Encoding

This contains the bpe_train which trains on the text corpus to generate byte pairs. It's approximately 20x faster than the version written in python!!.