Fuzzy field-based search with multiple terms
fliepeltje opened this issue · 2 comments
Reading through the docs and the source code, it seems you can specify which fields a given term is searched in, so you can issue a query like:
{
  "query": [
    {"term": {"ctx": "Harry Potter", "fields": ["role"]}, "occur": "must"},
    {"term": {"ctx": "Daniel Radcliffe", "fields": ["actor"]}, "occur": "must"}
  ]
}
It would be really neat if the same were possible for fuzzy queries, so that something like this would work:
{
  "query": [
    {"fuzzy": {"ctx": "Barry Potter", "fields": ["role"]}, "occur": "must"},
    {"fuzzy": {"ctx": "Daniel Radclif", "fields": ["actor"]}, "occur": "must"}
  ]
}
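For illustration, here is a minimal Python sketch of how a client might assemble a request body in the shape shown above. The helper names `field_clause` and `build_query` are hypothetical, and the `fuzzy` clause with `fields` is the form proposed in this issue, not a confirmed API.

```python
import json


def field_clause(kind, text, fields, occur="must"):
    """Build one query clause in the shape used in this issue (hypothetical helper).

    kind   -- clause type, e.g. "term" or the proposed field-scoped "fuzzy"
    text   -- the search text, sent as "ctx"
    fields -- names of the fields the clause should be restricted to
    """
    return {kind: {"ctx": text, "fields": list(fields)}, "occur": occur}


def build_query(clauses):
    """Wrap a sequence of clauses in the top-level "query" list."""
    return {"query": list(clauses)}


payload = build_query([
    field_clause("fuzzy", "Barry Potter", ["role"]),
    field_clause("fuzzy", "Daniel Radclif", ["actor"]),
])

# Serialize the payload so it can be sent as a request body.
print(json.dumps(payload, indent=2))
```

Building clauses through one helper keeps the `term` and `fuzzy` forms symmetric, which is the point of the request: only the clause type changes, not the field scoping.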
To put this in perspective with a use case, suppose the documents describe movies, actors, and roles. I might have heard that the lead character of a movie is called Harry Potter but have no idea which movie this character belongs to, and I want to know who the actor is.
If I used the fuzzy method, I could create one index across all 3 fields, but a search for Harry Potter would then return a bunch of actors only because the movie is called Harry Potter and the ...
Alternatively, I could create a separate index for each field and search them individually. But then, once I do have more information (say, the movie name or the actor name), I would have to compute a likelihood score myself from the results of searching multiple indexes.
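To make that workaround concrete, here is a rough Python sketch of the client-side merging it would require. Everything in it is an assumption for illustration: the function `merge_scores`, the per-index hit lists, and the choice to sum raw scores, which is naive because scores from different indexes are not necessarily comparable.

```python
from collections import defaultdict


def merge_scores(per_index_hits, require_all=True):
    """Combine hits from several single-field indexes into one ranking.

    per_index_hits -- dict mapping an index name to a list of (doc_id, score)
    require_all    -- if True, keep only documents found in every index,
                      mimicking "occur": "must" on each clause
    """
    totals = defaultdict(float)
    seen_in = defaultdict(set)
    for index_name, hits in per_index_hits.items():
        for doc_id, score in hits:
            totals[doc_id] += score  # naive: raw scores are simply summed
            seen_in[doc_id].add(index_name)

    wanted = set(per_index_hits)
    ranked = [
        (doc_id, total)
        for doc_id, total in totals.items()
        if not require_all or seen_in[doc_id] == wanted
    ]
    # Highest combined score first; ties broken by document id.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Made-up hits from two hypothetical single-field indexes.
hits = {
    "role": [("doc1", 2.0), ("doc2", 1.5)],
    "actor": [("doc1", 1.0), ("doc3", 3.0)],
}
merged = merge_scores(hits)
```

Field-scoped fuzzy clauses in a single query would push this bookkeeping into the search engine, where the scores are computed consistently.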
This is a good idea. I'll probably mark this as something to target for 0.10, as I want to stabilize things for the 0.9 release.
I plan on making the query system a bit more configurable and more intuitive, but right now the current query system makes this hard to do without creating a mess in the code base.
I forgot to close this when merging the PR, but this has been added to master now 👍