silentbicycle/glean

Non-text file indexing support


Set up configuration hooks* for non-text files that can nonetheless be meaningfully indexed: pass .mp3s through id3tag, PDFs through ps2ascii, .docs through antiword, etc., and index the output. (Optionally, cache the converted output.)

  • Idea: Call a script that reads a filename and returns "IGNORE", "OK", or a command line to pass the file through. Include an example awk script with reasonable defaults.
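
A rough sketch of what such a hook could look like, written in Python rather than awk purely for illustration. Everything here is an assumption and not part of glean: the `classify` name, the extension tables, and the convention that each converter takes the filename as its only argument and writes text to stdout.

```python
import os
import shlex

# Hypothetical defaults: file extension -> converter command.
# Assumes each tool accepts the filename as its sole argument
# and prints indexable text on stdout.
CONVERTERS = {
    ".mp3": "id3tag",
    ".pdf": "ps2ascii",
    ".doc": "antiword",
}

# Hypothetical set of extensions that are never worth indexing.
IGNORED = {".o", ".so", ".png", ".jpg", ".gif", ".zip", ".gz"}


def classify(filename):
    """Return "IGNORE", "OK", or a command line to filter the file through."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in CONVERTERS:
        # Quote the filename so spaces and shell metacharacters are safe.
        return CONVERTERS[ext] + " " + shlex.quote(filename)
    if ext in IGNORED:
        return "IGNORE"
    # Anything else is treated as plain text and indexed directly.
    return "OK"
```

The indexer would call this once per file: skip the file on "IGNORE", index it as-is on "OK", and otherwise run the returned command and index (and optionally cache) its output. Keeping the decision in an external script means users can change the mapping without touching the indexer.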