How should Nucleo work?
Thanks for creating the fuzzy library.
I ran into a weird problem with the Nucleo struct.
Consider the following code, which you can run on rust-explorer:
use std::sync::Arc;
use nucleo::Nucleo;
use nucleo::pattern::{CaseMatching, Normalization};

fn main() {
    let mut matcher = init_fuzzy_matcher();
    let inject = matcher.injector();
    let list = ["foobar", "fxxoo", "oo", "a"];
    list.iter().for_each(|s| {
        inject.push(s, |_| {});
    });
    matcher
        .pattern
        .reparse(0, "f", CaseMatching::Ignore, Normalization::Smart, false);
    let _status = matcher.tick(1000);
    dbg!(matcher.pattern.column_pattern(0));
    let mut counter = 0;
    loop {
        let _status = matcher.tick(100);
        // if status.changed {
        let snapshot = matcher.snapshot();
        let total = snapshot.item_count();
        let got = snapshot.matched_item_count();
        let res: Vec<_> = snapshot
            .matched_items(..)
            .map(|item| item.data)
            .collect();
        dbg!(total, got, res);
        // }
        // if !status.running {
        //     break;
        // }
        println!("running");
        if counter > 4 {
            break;
        }
        counter += 1;
    }
}

type Matcher = Nucleo<&'static str>;

fn init_fuzzy_matcher() -> Matcher {
    Nucleo::new(
        nucleo::Config::DEFAULT,
        Arc::new(|| println!("notified")),
        None,
        1,
    )
}
The res is always empty:
[src/main.rs:34:9] total = 4
[src/main.rs:34:9] got = 0
[src/main.rs:34:9] res = []
Using nucleo::Matcher directly with the same config, input, and needle string produces the desired output:
use nucleo::pattern::{Atom, AtomKind, CaseMatching, Normalization};
use nucleo::Matcher;

fn main() {
    let mut matcher = init_fuzzy_matcher();
    let list = ["foobar", "fxxoo", "oo", "a"];
    let res = Atom::new(
        "f",
        CaseMatching::Ignore,
        Normalization::Smart,
        AtomKind::Fuzzy,
        false,
    )
    .match_list(&list, &mut matcher);
    dbg!(res);
}

fn init_fuzzy_matcher() -> Matcher {
    Matcher::new(nucleo::Config::DEFAULT)
}
[src/main.rs:20:5] res = [
    (
        "foobar",
        36,
    ),
    (
        "fxxoo",
        36,
    ),
]
So the question is: how do we use Nucleo the right way? I saw an issue asking for examples, but it has no replies.
I also scanned helix's source files. Although nucleo is one of its dependencies, what it really uses is Matcher, not Nucleo.
Well, I think the problem comes from this signature:
// Injector<T>
pub fn push(
    &self,
    value: T,
    fill_columns: impl FnOnce(&mut [Utf32String])
) -> u32
I didn't use fill_columns to add the source string to the search list because I mistakenly thought value: T plays the same role as T in Atom/Pattern:
// Atom/Pattern
pub fn match_list<T>(
    &self,
    items: impl IntoIterator<Item = T>,
    matcher: &mut Matcher
) -> Vec<(T, u16)>
where
    T: AsRef<str>,
Actually, I did notice that Injector<T> lacks an AsRef<str> bound, and I was wondering how the matcher knows the source string. Now I understand that T on Injector<T> and T on Pattern::match_list<T> mean different things: the former is arbitrary data attached to each item, while the text to match comes from fill_columns.
And Nucleo is indeed what I need. Here's the working code:
// ...
let list = [
    "foobar".to_owned(),
    "fxxoo".to_owned(),
    "oo".to_owned(),
    "a long string".to_owned(),
];
for (idx, item) in list.iter().enumerate() {
    inject.push(Idx(idx), |buf| {
        dbg!(buf.len());
        // Write the text to match into the first (and only) column.
        if let Some(buf) = buf.first_mut() {
            *buf = item.as_str().into();
        }
    });
}
// ...
snapshot
    .matched_items(..)
    .map(|item| &list[item.data.0])
    .collect();
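A toy model may make the difference concrete. The sketch below is not nucleo's implementation: ToyInjector, Entry, and is_subsequence are hypothetical names, a single String stands in for the column buffer, and a case-insensitive subsequence test stands in for real fuzzy scoring. It only illustrates that the value passed to push is carried along untouched, while the text written by the callback is what gets searched.

```rust
// Toy model only: this is NOT nucleo's implementation.

/// One stored item: an opaque payload plus the text that is actually matched.
struct Entry<T> {
    data: T,
    haystack: String,
}

/// Hypothetical stand-in for an injector with a single matching column.
struct ToyInjector<T> {
    entries: Vec<Entry<T>>,
}

impl<T> ToyInjector<T> {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// `value` is stored untouched; only `fill` decides what gets searched.
    fn push(&mut self, value: T, fill: impl FnOnce(&mut String)) {
        let mut haystack = String::new();
        fill(&mut haystack);
        self.entries.push(Entry { data: value, haystack });
    }

    /// Returns the payloads whose haystack contains `needle` as a
    /// case-insensitive subsequence (a crude stand-in for fuzzy matching).
    fn matched(&self, needle: &str) -> Vec<&T> {
        self.entries
            .iter()
            .filter(|e| is_subsequence(needle, &e.haystack))
            .map(|e| &e.data)
            .collect()
    }
}

fn is_subsequence(needle: &str, haystack: &str) -> bool {
    let mut hay = haystack.chars().flat_map(char::to_lowercase);
    needle
        .chars()
        .flat_map(char::to_lowercase)
        .all(|n| hay.any(|h| h == n))
}

fn main() {
    let list = ["foobar", "fxxoo", "oo", "a"];

    // Mistake from the question: the callback leaves the haystack empty.
    let mut empty = ToyInjector::new();
    for s in list {
        empty.push(s, |_| {});
    }
    println!("{:?}", empty.matched("f")); // prints []

    // Fix: copy the text to be searched into the haystack.
    let mut filled = ToyInjector::new();
    for s in list {
        filled.push(s, |buf| buf.push_str(s));
    }
    println!("{:?}", filled.matched("f")); // prints ["foobar", "fxxoo"]
}
```

With the empty callback every haystack is an empty string, so a non-empty needle can never match, which mirrors the got = 0 result above.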
The last thing I don't understand is why the argument of the fill_columns callback is &mut [Utf32String].
> The last thing I don't understand is why the argument of the fill_columns callback is &mut [Utf32String].
Hah, I just realized it's due to the columns argument of Nucleo::<T>::new(..., columns). From the docs:
Nucleo can match items with multiple orthogonal properties. columns indicates how many matching columns each item (and the pattern) has. The number of columns can not be changed after construction.
I created a one-column Nucleo<T>, so the fill_columns callback passed to Injector<T>::push receives a slice of exactly one Utf32String to fill.
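To see why a slice is a natural fit, here is a small multi-column sketch. It is a toy with hypothetical names (ToyNucleo and its methods), it uses plain substring search instead of fuzzy scoring, and it assumes an item is kept only when every column matches its pattern. The point is just that the column count is fixed at construction and the callback always receives that many buffers.

```rust
// Toy model only: names and matching rule are illustrative, not nucleo's code.

struct ToyNucleo<T> {
    columns: usize,
    items: Vec<(T, Vec<String>)>,
}

impl<T> ToyNucleo<T> {
    /// The column count is fixed here and never changes afterwards.
    fn new(columns: usize) -> Self {
        Self { columns, items: Vec::new() }
    }

    /// The callback receives one empty buffer per column.
    fn push(&mut self, value: T, fill_columns: impl FnOnce(&mut [String])) {
        let mut cols = vec![String::new(); self.columns];
        fill_columns(&mut cols);
        self.items.push((value, cols));
    }

    /// `patterns[i]` is matched against column `i`; an item is kept only if
    /// every column contains its pattern (plain substring search here).
    fn matched(&self, patterns: &[&str]) -> Vec<&T> {
        assert_eq!(patterns.len(), self.columns);
        self.items
            .iter()
            .filter(|(_, cols)| cols.iter().zip(patterns).all(|(c, p)| c.contains(*p)))
            .map(|(data, _)| data)
            .collect()
    }
}

fn main() {
    // Two columns per item: a file name and a directory.
    let mut toy = ToyNucleo::new(2);
    let files = [("main.rs", "src"), ("lib.rs", "src"), ("main.rs", "examples")];
    for (idx, (name, dir)) in files.iter().enumerate() {
        toy.push(idx, |cols| {
            assert_eq!(cols.len(), 2); // always equals `columns`
            cols[0].push_str(name);
            cols[1].push_str(dir);
        });
    }
    // "main" in column 0 and "src" in column 1 selects only item 0.
    println!("{:?}", toy.matched(&["main", "src"])); // prints [0]
}
```

With columns set to 1, as in the code above, the slice has length one, which is why buf.first_mut() is enough there.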