Index and search the contents of your documents stored in Dropbox. Requires Node.js and Solr, supports many document types and keyword shortcuts, and updates the index as you add or edit files. Bundled with a simple web front end that shows snippets and loads results via Ajax.
Query | Result |
--- | --- |
recipe | Search for text (case and stemming aware) |
"cake recipe" | Phrase search |
jam recipe in:Files | Match documents within a folder |
when:yesterday | All documents modified yesterday |
recipe when:2012 | Matches from year 2012 |
by:Mike | Match the given author |
dogs type:image | Return only images |
where:40"47' | Match lat/long in image metadata |
- Get a Dropbox API key.
- Set up and run your Solr instance.
- Use the included solr/schema.xml file; note the lines marked EDIT dropbox-search.
- Edit environment variables as below.
- `npm install` to download dependencies (solr, dbox, express, dateformat).
- `node indexer.js` to index your documents.
- `node server.js` to launch the web app.
- Browse to http://localhost:8888/search
DROPBOX_APP_KEY =
DROPBOX_APP_SECRET =
DROPBOX_UID =
DROPBOX_OAUTH_TOKEN =
DROPBOX_OAUTH_SECRET =
SOLR_HOST = 127.0.0.1
SOLR_PORT = 8983
ROOT_PATH = /
The code doesn't yet implement the OAuth protocol, so for now you must complete the OAuth flow manually and provide the resulting token and secret.
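
As a rough illustration of how the settings above could be read at startup, here is a minimal sketch. The `loadConfig` function and the shape of the object it returns are hypothetical and not taken from the project's code. The defaults for `SOLR_HOST`, `SOLR_PORT`, and `ROOT_PATH` mirror the values listed above, and treating the five Dropbox values as required is an assumption.

```javascript
// Hypothetical helper: collect the environment variables listed above.
// The Dropbox values are treated as required; the rest fall back to the
// defaults shown in this README.
function loadConfig(env) {
  const required = [
    'DROPBOX_APP_KEY',
    'DROPBOX_APP_SECRET',
    'DROPBOX_UID',
    'DROPBOX_OAUTH_TOKEN',
    'DROPBOX_OAUTH_SECRET',
  ];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  return {
    dropbox: {
      appKey: env.DROPBOX_APP_KEY,
      appSecret: env.DROPBOX_APP_SECRET,
      uid: env.DROPBOX_UID,
      oauthToken: env.DROPBOX_OAUTH_TOKEN,
      oauthSecret: env.DROPBOX_OAUTH_SECRET,
    },
    solr: {
      host: env.SOLR_HOST || '127.0.0.1',
      port: parseInt(env.SOLR_PORT || '8983', 10),
    },
    rootPath: env.ROOT_PATH || '/',
  };
}

// Typical use: const config = loadConfig(process.env);
```

Failing early with the full list of missing names makes a misconfigured shell easier to diagnose than a later authentication error.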
Dropbox-search uses Solr's ExtractingRequestHandler to index multiple file types, including: pdf, doc and docx (Word), xls (Excel), ppt, odt, csv, html, rtf, txt, and more. In addition to text content, it extracts metadata such as author and date. For image files, it extracts EXIF metadata like gps_latitude.
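
To give a feel for what a request to the ExtractingRequestHandler looks like, here is a small sketch that builds the URL a file would be POSTed to. `buildExtractUrl` is a hypothetical helper, not part of this project. The `/solr/update/extract` path is the one used in Solr's example configuration and is an assumption here (it changes if you use a named core or a custom handler path). `literal.id` and `commit` are standard parameters of that handler.

```javascript
// Hypothetical helper: build the ExtractingRequestHandler URL for one file.
// Assumes the handler is mounted at /solr/update/extract.
function buildExtractUrl(host, port, docId, extraLiterals) {
  const params = new URLSearchParams();
  // literal.<field> sets a field value directly on the indexed document.
  params.set('literal.id', docId);
  for (const [field, value] of Object.entries(extraLiterals || {})) {
    params.set('literal.' + field, String(value));
  }
  // Ask Solr to commit so the document becomes searchable right away.
  params.set('commit', 'true');
  return 'http://' + host + ':' + port + '/solr/update/extract?' + params.toString();
}

// The raw file bytes would then be sent as the body of a POST to this URL.
```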
I also define some useful shortcuts like:
- when : matches a date (e.g. today, yesterday, year-mm-dd, year-mm, or year)
- type : matches a file type (e.g. image or rtf)
- in : matches files within the given folder or path fragment (e.g. MyFiles)
- by : same as author
- where : matches gps_latitude or gps_longitude
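
The shortcuts above can be pictured as a small parsing step that separates `key:value` tokens from ordinary search text. The sketch below is only an illustration: `parseQuery` and its return shape are hypothetical, and it deliberately stops short of mapping shortcuts to Solr fields, since that mapping depends on the schema.

```javascript
// Shortcut names taken from the list above.
const SHORTCUTS = ['when', 'type', 'in', 'by', 'where'];

// Hypothetical helper: split a query into free text and shortcut filters.
function parseQuery(input) {
  // Keep double-quoted phrases together; otherwise split on whitespace.
  const tokens = input.match(/"[^"]*"|\S+/g) || [];
  const filters = {};
  const terms = [];
  for (const token of tokens) {
    const colon = token.indexOf(':');
    const key = colon > 0 ? token.slice(0, colon) : '';
    if (SHORTCUTS.includes(key) && colon < token.length - 1) {
      filters[key] = token.slice(colon + 1);
    } else {
      // Anything that is not a known shortcut stays in the text query.
      terms.push(token);
    }
  }
  return { text: terms.join(' '), filters };
}

// parseQuery('jam recipe in:Files')
//   returns { text: 'jam recipe', filters: { in: 'Files' } }
```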
The indexer listens for Dropbox API delta events to fetch documents that need to be added or removed from the index.
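
A minimal sketch of how delta entries might be turned into index work, assuming the entry shape documented for the Dropbox v1 `/delta` endpoint: each entry is a `[path, metadata]` pair, and `metadata` is `null` when the path was deleted. `planIndexChanges` is a hypothetical name, and skipping folders via `is_dir` is an assumption about what the indexer would want.

```javascript
// Hypothetical helper: decide which paths to (re)index and which to remove.
// Assumes entries look like [path, metadata], with metadata === null for
// deletions, as in the Dropbox v1 delta format.
function planIndexChanges(entries) {
  const toFetch = [];
  const toRemove = [];
  for (const [path, metadata] of entries) {
    if (metadata === null) {
      // Deleted in Dropbox: drop it from the index.
      toRemove.push(path);
    } else if (!metadata.is_dir) {
      // Added or changed file: fetch it and send it to Solr.
      toFetch.push(path);
    }
  }
  return { toFetch, toRemove };
}
```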
Note: Dropbox may rate-limit excessive file fetches by returning 503 errors. I try to handle this by queueing file fetches to happen at most once per second.
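
The once-per-second queueing described above can be sketched roughly as follows. This is an illustration of the idea rather than the project's actual code: `createFetchQueue` is a hypothetical name, and the optional `schedule` argument exists only so the timer can be swapped out (it defaults to `setTimeout`).

```javascript
// Hypothetical helper: run queued tasks with at least intervalMs between
// the start of one task and the start of the next.
function createFetchQueue(intervalMs, schedule) {
  const later = schedule || setTimeout;
  const pending = [];
  let running = false;

  function runNext() {
    const task = pending.shift();
    if (!task) {
      // Nothing left; the next push restarts the loop.
      running = false;
      return;
    }
    task();
    // Wait out the interval before starting the following task.
    later(runNext, intervalMs);
  }

  return {
    push(task) {
      pending.push(task);
      if (!running) {
        running = true;
        runNext();
      }
    },
    size() {
      return pending.length;
    },
  };
}

// Typical use:
//   const queue = createFetchQueue(1000);
//   queue.push(() => console.log('fetch /files/jam.pdf'));
```

The first task runs immediately; anything pushed within the following interval waits its turn, so a burst of delta entries is spread out instead of hitting Dropbox all at once.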
File type icons © Dropbox Icon Library.