Fast and trainable tokenizer for natural languages relying on maximum entropy methods.
Primary language: C++
License: Other (NOASSERTION)