How should/does nucleo handle umlauts?
Closed this issue · 2 comments
jessegrosjean commented
For example, I notice that a needle ë fails to fuzzy match bë. On the other hand, a needle e will match bë, and a needle ë will match a haystack ë.
let paths = ["be", "bë"];
let mut matcher = Matcher::new(Config::DEFAULT);
let matches = Pattern::parse("ë", CaseMatching::Ignore).match_list(paths, &mut matcher);
assert_eq!(matches.len(), 1); // fails
Is that expected or a bug? If it is expected, can you say a bit more about why and suggest workarounds? Mostly just so I can document for people using my app why it works the way that it does.
Thank you.
Tyarel8 commented
You are using Config::DEFAULT, which has normalization turned on.
pub const DEFAULT: Config = {
    Config {
        delimiter_chars: b"/,:;|",
        bonus_boundary_white: BONUS_BOUNDARY + 2,
        bonus_boundary_delimiter: BONUS_BOUNDARY + 1,
        initial_char_class: CharClass::Whitespace,
        normalize: true,
        ignore_case: true,
        prefer_prefix: false,
    }
};
Normalization converts accented (non-ASCII) characters to their ASCII equivalents, so turning it off should solve this.
let mut conf = Config::DEFAULT;
conf.normalize = false;
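To illustrate why the flag matters, here is a toy model, not nucleo's actual implementation. It assumes that normalization is applied to the haystack only, which is one plausible reading of the behavior reported above. The names normalize_char and toy_fuzzy_match are hypothetical, the character table is deliberately tiny, and case folding is left out. It only covers the bë cases and does not try to reproduce every case from the report (for instance a needle ë matching a haystack that is exactly ë).

```rust
/// Hypothetical stand-in for a normalization table: maps a few accented
/// Latin letters to their ASCII base letter and leaves everything else alone.
fn normalize_char(c: char) -> char {
    match c {
        'ä' | 'á' | 'à' | 'â' => 'a',
        'ë' | 'é' | 'è' | 'ê' => 'e',
        'ï' | 'í' | 'ì' | 'î' => 'i',
        'ö' | 'ó' | 'ò' | 'ô' => 'o',
        'ü' | 'ú' | 'ù' | 'û' => 'u',
        other => other,
    }
}

/// Toy subsequence match: every needle char must occur in the haystack,
/// in order. When `normalize` is true, ONLY the haystack is normalized
/// (an assumption made for this sketch); the needle is compared as typed.
fn toy_fuzzy_match(needle: &str, haystack: &str, normalize: bool) -> bool {
    let mut hay = haystack
        .chars()
        .map(|c| if normalize { normalize_char(c) } else { c });
    // `any` advances the shared iterator, so matches must appear in order.
    needle.chars().all(|n| hay.any(|h| h == n))
}
```

In this model, toy_fuzzy_match("ë", "bë", true) is false because the haystack has already become "be", while toy_fuzzy_match("e", "bë", true) is true. With normalize set to false, the needle ë matches bë but no longer matches be, and a plain e no longer matches bë, so turning normalization off is a trade-off.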
jessegrosjean commented
Thanks for your help, that solved the problem for me.