This Rust crate is a binding for the sentencepiece unsupervised text tokenizer. The crate documentation is available online.
This crate depends on the sentencepiece C++ library. By default, this dependency is treated as follows:
- If sentencepiece could be found with pkg-config, the crate will link against the library found through pkg-config. Warning: dynamic linking only works correctly with sentencepiece 0.1.95 or later, due to a bug in earlier versions.
- Otherwise, the crate's build script will do a static build of the sentencepiece library. This requires that cmake is available.

If you wish to override this behavior, the sentencepiece-sys crate offers two features:
- system: always attempt to link to the sentencepiece library found with pkg-config.
- static: always do a static build of the sentencepiece library and link against that.