bnosac/sentencepiece
R package for Byte Pair Encoding / Unigram modelling based on Sentencepiece
C++, MPL-2.0
Issues
- 0
upgrade to sentencepiece 0.2.0
#9 opened by jwijffels
- 1
clang UBSAN
#8 opened by jwijffels
- 2
Need fixes in original sentencepiece C++ code to be able to be accepted on CRAN
#1 opened by jwijffels
- 4
Pull out wordpiece_encode?
#7 opened by jonthegeek
- 26
1-character wordpieces fail to encode
#4 opened by jonthegeek
- 2
wordpiece_encode ids
#6 opened by jonthegeek
- 1
add txt_remove_
#3 opened by jwijffels