How should/does nucleo handle umlauts?

Question

Closed this issue 7 months ago · 2 comments

For example, I notice that the needle ë fails to fuzzy match the haystack bë. On the other hand, the needle e will match bë, and the needle ë will match the haystack ë.

let paths = ["be", "bë"];
let mut matcher = Matcher::new(Config::DEFAULT);
let matches = Pattern::parse("ë", CaseMatching::Ignore).match_list(paths, &mut matcher);
assert_eq!(matches.len(), 1); // fails

Is that expected, or is it a bug? If it is expected, can you say a bit more about why and suggest workarounds? This is mostly so I can explain to people using my app why it works the way it does.

Thank you.

Answer 1 · 2023-12-17T12:04:26.000Z

You are using Config::DEFAULT, which has normalization turned on.

pub const DEFAULT: Config = {
    Config {
        delimiter_chars: b"/,:;|",
        bonus_boundary_white: BONUS_BOUNDARY + 2,
        bonus_boundary_delimiter: BONUS_BOUNDARY + 1,
        initial_char_class: CharClass::Whitespace,
        normalize: true,
        ignore_case: true,
        prefer_prefix: false,
    }
};

Normalization converts non-ASCII characters to ASCII (for example, ë becomes e), so turning it off should solve this.

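let mut conf = Config::DEFAULT;
conf.normalize = false;
// Build the matcher from the modified config instead of Config::DEFAULT.
let mut matcher = Matcher::new(conf);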
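
To make the effect easier to explain to users, here is a toy model of what seems to be happening. It is not nucleo's implementation. It assumes, based on the behaviour reported in the question, that normalization is applied to the haystack only, while the needle is compared as typed. The helpers `normalize` and `toy_match` are hypothetical names made up for this sketch, and the mapping table covers just a handful of letters. The sketch reproduces the ë versus bë and e versus bë cases; it does not model the equal-length ë versus ë case.

```rust
/// Toy stand-in for Latin normalization (hypothetical helper, tiny table).
fn normalize(c: char) -> char {
    match c {
        'ë' | 'é' | 'è' | 'ê' => 'e',
        'ä' => 'a',
        'ö' => 'o',
        'ü' => 'u',
        other => other,
    }
}

/// Toy subsequence match: every needle char must occur, in order, in the haystack.
/// Assumption: when `normalize_haystack` is true, only the haystack is normalized.
fn toy_match(needle: &str, haystack: &str, normalize_haystack: bool) -> bool {
    let mut hay = haystack
        .chars()
        .map(|c| if normalize_haystack { normalize(c) } else { c });
    // `any` advances the shared iterator, so matches must appear in order.
    needle.chars().all(|n| hay.any(|h| h == n))
}

fn main() {
    // Normalization on: "bë" is seen as "be", so the needle "ë" finds nothing.
    println!("on,  ë in bë: {}", toy_match("ë", "bë", true));
    println!("on,  e in bë: {}", toy_match("e", "bë", true));
    // Normalization off: the haystack keeps its "ë", so the needle "ë" matches.
    println!("off, ë in bë: {}", toy_match("ë", "bë", false));
    println!("off, ë in be: {}", toy_match("ë", "be", false));
}
```

Under this assumption, turning normalization off makes the needle ë match bë but not be, which is the single match the assertion in the question expects. The trade-off is that the needle e no longer matches bë.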

Answer 2 · 2023-12-17T16:48:36.000Z

Thanks for your help, that solved the problem for me.