christhompson/loredb

Querying lore


Ideally, the search functionality should be much richer. The current method (a basic keyword search of the "lore" field, with optional command line flags to search other fields) is clunky and hard to remember. A better method would be to allow user-defined queries on the table. This issue tracks the research and the steps for designing and implementing this query support.

  • Define the query language. What kinds of queries should we support? What should the syntax be like?
  • Build/configure a parser for the query language.
  • Compile parsed query into a selection expression in our ORM.
  • Write tests???

Currently, the best bet I've found is to use an existing query parser (maybe Whoosh), and then recursively compile the parse tree into peewee Expression nodes.
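To make the "recursively compile the parse tree" step concrete, here is a rough sketch. Everything in it is an assumption for illustration: the nested-tuple tree shape, the `entries` table, and the column names are hypothetical, and it emits a parameterized SQL WHERE clause through the standard-library `sqlite3` module rather than peewee Expression nodes, just to stay dependency-free. With peewee, the same recursion would presumably combine the sub-expressions with `&` and `|` instead of building strings.

```python
import sqlite3

# Hypothetical parse-tree shape (not taken from loredb): nested tuples.
#   ("and", left, right) / ("or", left, right)
#   ("cmp", field, op, value)   e.g. ("cmp", "rating", ">", 0.75)
#   ("term", word)              bare keyword, matched against the lore text
ALLOWED_FIELDS = {"lore", "author", "rating"}  # assumed column names
ALLOWED_OPS = {"=", ">", "<", ">=", "<="}


def compile_node(node):
    """Recursively turn a parse tree into (sql_fragment, params)."""
    kind = node[0]
    if kind in ("and", "or"):
        left_sql, left_params = compile_node(node[1])
        right_sql, right_params = compile_node(node[2])
        sql = f"({left_sql} {kind.upper()} {right_sql})"
        return sql, left_params + right_params
    if kind == "cmp":
        _, field, op, value = node
        # Only whitelisted names reach the SQL text; values stay parameters.
        if field not in ALLOWED_FIELDS or op not in ALLOWED_OPS:
            raise ValueError(f"unsupported comparison: {field} {op}")
        return f"{field} {op} ?", [value]
    if kind == "term":
        return "lore LIKE ?", [f"%{node[1]}%"]
    raise ValueError(f"unknown node type: {kind}")


# Tree for: (term1 and term2 or author=chris) and rating>0.75
tree = ("and",
        ("or",
         ("and", ("term", "term1"), ("term", "term2")),
         ("cmp", "author", "=", "chris")),
        ("cmp", "rating", ">", 0.75))
where_sql, where_params = compile_node(tree)

# Try the compiled clause against a throwaway in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (lore TEXT, author TEXT, rating REAL)")
conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", [
    ("term1 meets term2", "alice", 0.9),
    ("unrelated", "chris", 0.8),
    ("term1 meets term2", "alice", 0.5),
])
rows = conn.execute(
    f"SELECT lore, author FROM entries WHERE {where_sql}", where_params
).fetchall()
```

The first two rows should come back and the third should be filtered out by the rating check. Keeping the compiler as one small function per node type also gives the "Write tests???" item something easy to target.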

Using Whoosh would force a specific query language/syntax (roughly like Lucene). Queries would look something like:

(term1 AND term2 OR author:chris) AND rating:>0.75

That's okay, but maybe not ideal. A more natural query language could be better, akin to:

(term1 and term2 or author=chris) and rating>0.75

To make a fully custom query language, we'd need to construct our own parser for it. It looks like the "recommended" simple parser library is pyparsing. The docs are not terribly great, and the included examples are in a random mishmash of licenses (or not licensed at all for reuse), which makes them kind of useless.
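For a sense of how much work a custom parser would be, here is a minimal hand-written recursive-descent sketch that uses only the standard library. It is not pyparsing, and not a proposal for the final grammar. The nested-tuple output shape, the operator set, and the precedence (`and` binds tighter than `or`) are all assumptions made for the example.

```python
import re

OPERATORS = {"=", ">", "<", ">=", "<="}
# Tokens: parentheses, comparison operators, or a run of any other characters.
TOKEN_RE = re.compile(r"\(|\)|>=|<=|=|>|<|[^\s()=<>]+")


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of query")
        self.pos += 1
        return token

    def parse_or(self):
        # Lowest precedence: a chain of and-expressions joined by "or".
        node = self.parse_and()
        while self.peek() == "or":
            self.take()
            node = ("or", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_atom()
        while self.peek() == "and":
            self.take()
            node = ("and", node, self.parse_atom())
        return node

    def parse_atom(self):
        token = self.take()
        if token == "(":
            node = self.parse_or()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        if token in OPERATORS or token in (")", "and", "or"):
            raise ValueError(f"unexpected token: {token}")
        if self.peek() in OPERATORS:
            op = self.take()
            # Values are kept as strings; type conversion is left to a later step.
            return ("cmp", token, op, self.take())
        return ("term", token)


def parse_query(text):
    parser = Parser(TOKEN_RE.findall(text))
    tree = parser.parse_or()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token: {parser.peek()}")
    return tree


tree = parse_query("(term1 and term2 or author=chris) and rating>0.75")
```

Here `tree` is a nested tuple with an `"and"` node at the root, whose left side is the parenthesized `"or"` group and whose right side is the rating comparison. Quoted strings, negation, and case-insensitive keywords are deliberately left out, and those are the parts where a library like pyparsing would likely start to pay off.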

Related: searching for entries with a blank author (loredb search -a '') doesn't actually work. It just tries to match the blank string, and that match is always true, so every entry comes back.

The new query feature should be rich enough to specify "does not have field X".
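A small sketch of the difference, using the standard-library `sqlite3` module and a made-up `entries` table. It assumes the current author search behaves like a substring match, which is a guess at the cause and not something confirmed here. The `missing_field_clause` helper is hypothetical. In peewee, the equivalent check would presumably be built from `Field.is_null()` combined with a comparison against the empty string.

```python
import sqlite3

ALLOWED_FIELDS = {"lore", "author"}  # assumed column names

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (lore TEXT, author TEXT)")
conn.executemany("INSERT INTO entries VALUES (?, ?)", [
    ("signed quote", "chris"),
    ("blank author", ""),
    ("null author", None),
])

# Substring match on the empty string: every non-NULL author matches,
# so this cannot single out the entries without an author.
blank = ""
substring_hits = conn.execute(
    "SELECT lore FROM entries WHERE author LIKE ?", (f"%{blank}%",)
).fetchall()


def missing_field_clause(field):
    """Hypothetical helper: SQL for 'does not have field X'."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field}")
    return f"({field} IS NULL OR {field} = '')"


missing_hits = conn.execute(
    f"SELECT lore FROM entries WHERE {missing_field_clause('author')}"
).fetchall()
```

The substring search returns the signed entry along with the blank one, while the explicit clause returns only the entries whose author is empty or NULL. Whatever syntax the query language ends up with for this, it needs to compile to something like the second form.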