Issues
- 2
Disable HMM feature of Jieba
#136 opened - 4
Compile/Install Charabia on OpenBSD
#118 opened - 1
Add support for Thai
#113 opened - 1
- 3
Explain the name of the repo in the README
#102 opened - 0
Add support for Hebrew
#100 opened - 9
- 2
- 1
- 1
Make Latin Segmenter split on `'`
#90 opened - 1
Reimplement Japanese Segmenter
#89 opened - 1
- 1
- 1
- 0
Decompose Japanese compound words
#74 opened - 0
Tokenizer refactoring strategy
#72 opened - 1
Support for Tatar language
#68 opened - 1
Chinese highlight
#65 opened - 0
- 7
- 13
Get size of char after normalization
#54 opened - 1
Change the crate name
#51 opened - 0
- 0
Add bors
#40 opened - 2
Handle non-breakable spaces
#38 opened - 10
Wrong matching for Arabic
#36 opened - 3
Publish tokenizer to crates.io
#35 opened - 1
- 1
Introduce HTML tags separators
#33 opened - 3
Tokenizer for Ja/Ko
#30 opened - 1
Project naming question
#25 opened - 0
Add Actual Tokenizer state
#2 opened - 3
Rework meilisearch tokenizer
#1 opened