Find Unix (Linux / macOS) Alternative for Everything
Closed this issue · 6 comments
We need similar functionality (with any kind of API to call) to get the paths of all images.
I guess using os.walk and glob would be sufficient and portable, but you still need to prepare patterns to match all image extensions and find a way to monitor changes.
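To make the os.walk idea concrete, here is a minimal sketch. It matches on a set of lowercased extensions instead of glob patterns, which sidesteps the "prepare patterns" step. The extension list and the name `iter_images` are assumptions for illustration, not something from this repo.

```python
import os

# Assumed extension list; extend it to cover the formats you care about.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def iter_images(root):
    """Yield paths of files under root whose extension is in IMAGE_EXTENSIONS."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare the lowercased extension so "PHOTO.JPG" also matches.
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                yield os.path.join(dirpath, name)
```

Because it is a generator, callers can start processing paths before the whole tree has been walked. It does not address change monitoring.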
It should work, I guess, and the pattern is not the problem.
The main problem is that it would take too much time. Everything already handles indexing, monitoring changes, and keeping its index up to date.
check this: https://github.com/cboxdoerfer/fsearch
although I'm in favor of building a simpler, OS-independent pipeline
ofc me too, the only OS-dependent part is the Everything SDK.
I tried os.scandir and it got all images in 3:30 minutes, which is not bad if it's done once (not each time the server starts). We still need to think about how to monitor changes. Any ideas?
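One way the "scan once, not on every server start" part could look is sketched below: a recursive os.scandir scan whose result is saved to a JSON file and reused on later runs. The function names, the index file, and the extension tuple are all hypothetical, and this is not necessarily how the scan above was written.

```python
import json
import os

# Assumed extension list for illustration.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def scan_images(root):
    """Recursively collect image paths under root using os.scandir."""
    paths = []
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file() and entry.name.lower().endswith(IMAGE_EXTENSIONS):
                        paths.append(entry.path)
        except OSError:
            # Skip directories that cannot be read (permissions, races).
            continue
    return paths


def load_or_build_index(root, index_file):
    """Reuse a saved index if one exists; otherwise scan once and save it."""
    if os.path.exists(index_file):
        with open(index_file, encoding="utf-8") as f:
            return json.load(f)
    paths = scan_images(root)
    with open(index_file, "w", encoding="utf-8") as f:
        json.dump(paths, f)
    return paths
```

The cached index goes stale as soon as files change, so it still needs to be paired with some form of change monitoring.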
I'm thinking of adding this as basic setup (like HF transformers in models) and if anyone wants better options there's Everything or equivalent tools.
I'm really grateful for your help and your ideas ❤️❤️
check this: https://www.geeksforgeeks.org/create-a-watchdog-in-python-to-look-for-filesystem-changes/
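If pulling in the watchdog package is not an option, a dependency-free alternative is to poll: take a snapshot of modification times and diff it against the previous one. This is a rough sketch of that swapped-in technique, not the watchdog approach from the link, and the names `snapshot` and `diff_snapshots` are made up for illustration.

```python
import os


def snapshot(root):
    """Map each file path under root to its last-modified time in nanoseconds."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime_ns
            except OSError:
                continue  # file vanished between listing and stat
    return state


def diff_snapshots(old, new):
    """Return (added, removed, modified) path lists between two snapshots."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, modified
```

Polling re-walks the whole tree each time, so it costs about as much as a full scan. An event-based watcher avoids that, at the price of an extra dependency.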
also I found that using the native search functionality for each system is very fast so if you are willing to compromise, maybe use it
```python
import subprocess

def find_files(directory, extensions):
    # Builds: find <directory> -type f ( -iname '*.jpg' -o -iname '*.png' ... )
    # Passing a list (no shell=True) avoids quoting and injection problems.
    cmd = ["find", directory, "-type", "f", "("]
    for i, ext in enumerate(extensions):
        if i > 0:
            cmd.append("-o")
        cmd += ["-iname", f"*.{ext}"]
    cmd.append(")")
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.splitlines()

files = find_files('mnt', ['jpg', 'png'])
```
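Since find only exists on Unix-like systems, one possible compromise is to use it when it is available and fall back to os.walk elsewhere. The sketch below assumes that design; `list_images` is a hypothetical name and nothing here is taken from the project's code.

```python
import os
import shutil
import subprocess


def list_images(directory, extensions):
    """Use the native find command when available, else fall back to os.walk."""
    # On Windows, "find" is an unrelated text-search tool, so require POSIX.
    if os.name == "posix" and shutil.which("find"):
        cmd = ["find", directory, "-type", "f", "("]
        for i, ext in enumerate(extensions):
            if i > 0:
                cmd.append("-o")
            cmd += ["-iname", f"*.{ext}"]
        cmd.append(")")
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout.splitlines()
    # Portable fallback: case-insensitive suffix match over os.walk.
    suffixes = tuple(f".{ext.lower()}" for ext in extensions)
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(directory)
        for name in filenames
        if name.lower().endswith(suffixes)
    ]
```

Both branches return plain path strings, so callers do not need to know which one ran. Filenames containing newlines would be split incorrectly in the find branch.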
great work btw
Amazing! I didn't know this already existed lol. I will give it a try 😉😉