nspragg/filehound

Slow search

borekb opened this issue · 5 comments

Hi, I'm trying to use FileHound to find package.json files in a directory tree with about 80,000 files (most of them in node_modules which I'm excluding, see below) and it runs much slower than find. I'm not sure if I'm doing something wrong or if FileHound's implementation could be improved. This is what I'm seeing:

FileHound, taking around 18 seconds on average:

packageJsons = await FileHound.create()
    .path(rootDir)
    .discard('node_modules')
    .match('package.json')
    .find();

find (actually the "slower" find as I'm on Windows and this is Git for Windows' find), taking around 2 seconds:

await execa.shell(`find . -path "*node_modules*" -prune -o -name package.json -print`);

Any ideas?

Hi @borekb.

I'm sure some performance improvements could be made to the lib. I'm looking to write a v2.0.0 within the next couple of weeks, and performance improvements are in scope. However, given this is a node lib, its performance will never be comparable to that of find (written in C).

Is the code you're writing a node lib/app?

If not, for one-off/ad hoc scripts that involve more intensive searches, I'd recommend using the OS search utils directly, e.g. via shell scripts (on Linux/Unix).

Thanks for the response. Does my configuration look right? Maybe I'm calling FileHound in a way that makes it traverse the whole tree, I'm not sure (my guess would be that the efficiency of filesystem traversal has a bigger impact than C vs. JS).

Yes, the query you're making looks fine. However, can you further reduce the search space by adding more exclusions, e.g. other paths you definitely don't wish to search, or by limiting the recursion depth?
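To make the idea of pruning and depth limiting concrete, here is a minimal sketch of a walker that uses only Node's built-in modules. It is not FileHound's implementation, and `findFiles` and its options (`name`, `exclude`, `maxDepth`) are hypothetical names chosen for illustration. The point it shows is that an excluded directory is skipped before it is ever read, so its contents never add to the traversal cost.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Hypothetical helper for illustration only; not part of FileHound.
// Returns the paths of files called `name` under `rootDir`.
// `exclude` lists directory names that are never entered.
// `maxDepth` limits recursion: 0 searches only `rootDir` itself,
// 1 also searches its direct subdirectories, and so on.
async function findFiles(rootDir, { name, exclude = [], maxDepth = Infinity } = {}) {
  const results = [];
  const skip = new Set(exclude);

  async function walk(dir, depth) {
    // withFileTypes avoids a separate stat call per entry.
    const entries = await fs.readdir(dir, { withFileTypes: true });
    for (const entry of entries) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        // Prune: skip excluded directories and anything past maxDepth.
        if (!skip.has(entry.name) && depth < maxDepth) {
          await walk(fullPath, depth + 1);
        }
      } else if (entry.isFile() && entry.name === name) {
        results.push(fullPath);
      }
    }
  }

  await walk(rootDir, 0);
  return results;
}
```

A call such as `await findFiles(rootDir, { name: 'package.json', exclude: ['node_modules'], maxDepth: 3 })` would correspond to the query in this issue with a depth limit added. The value 3 is only an example.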

This is about as much limiting as I can afford. 2 seconds is not that bad, but out of curiosity I tried ripgrep, which also searches file contents and so should be slower in theory, yet it gives me correct results in 0.05 seconds (!), so I'm going to go with it.

Still, FileHound is a nice JS-only solution, thanks for making it available.

Will be releasing golang and rust versions in the near future :)