Natural language detection for Rust. Documentation.
- Supports 84 languages
- 100% written in Rust
- No external dependencies
- Fast
- Recognizes not only the language but also the script (Latin, Cyrillic, etc.)
Add to your Cargo.toml:
[dependencies]
whatlang = "0.3.1"
Small example:
use whatlang::{detect, Lang, Script};
// Detect Esperanto (there are also `detect_lang` and `detect_script` functions)
let info = detect("Ĉu vi ne volas eklerni Esperanton? Bonvolu!").unwrap();
assert_eq!(info.lang, Lang::Epo);
assert_eq!(info.script, Script::Latin);
You can create a configured detector to apply a blacklist or whitelist:
use whatlang::{Detector, Lang};
const WHITELIST: &'static [Lang] = &[Lang::Eng, Lang::Rus];
// You can also create a detector using the `with_blacklist` function
let detector = Detector::with_whitelist(WHITELIST);
// There are also `detect` and `detect_script` functions
let lang = detector.detect_lang("There is no reason not to learn Esperanto.");
assert_eq!(lang, Some(Lang::Eng));
For more details, please check the documentation.
Running benchmarks:
cargo bench
- Support about 100 languages (actually at the moment it's 84)
- Allow to specify blacklist for Query
- Allow to specify whitelist for Query
- Support new API
- Write doc for public structures and functions
- Improve README example
- Implement benchmarks
- Tune performance
- Create examples
- Provide some metrics about reliability (confidence) in `Info` struct
MIT
- Thanks to Franc JS for the trigrams dataset.