soaxelbrooke/phrase
A tool for learning significant phrase/term models, and efficiently labeling with them.
Rust · Apache-2.0
Issues
Duplicated tokens when using MAX_NGRAM > 10
#22 opened by soaxelbrooke - 0
Try Different PMI Estimate
#21 opened by soaxelbrooke - 3
Saturates memory on large corpus
#18 opened by jjhbw - 0
Ignore numbers and emails
#15 opened by soaxelbrooke - 1
Make label names more permissive
#17 opened by soaxelbrooke - 2
Add /count API endpoint
#14 opened by soaxelbrooke - 0
Add fit-transform style CLI option
#13 opened by soaxelbrooke - 1
Performance Regression
#12 opened by soaxelbrooke - 3
Calculate stems during export
#11 opened by soaxelbrooke - 0
Make thread per label being counted
#8 opened by soaxelbrooke - 0
Add stemming and canonical form mapping
#4 opened by soaxelbrooke - 0
Keep ngram counts in sorted order
#9 opened by soaxelbrooke - 0
Add "transform" subcommand
#5 opened by soaxelbrooke - 0
Implement vocab pruning
#7 opened by soaxelbrooke - 0
Add environment variables for parameters
#6 opened by soaxelbrooke - 0
`phrase` doesn't print usage
#1 opened by soaxelbrooke - 0