elixir-haystack/haystack

Query customization

densefog opened this issue · 5 comments

I'm enjoying Haystack and wondering if there is a way to customize how queries are run by field.

For example, say I have a list of books in a JSON document and each entry has a book title and a book category. Is there currently a way to specify an exact 'category' and have the search run across the title only? For instance: give me all the books in the 'Elixir' category that have 'ecto' in the title.

I tried creating a separate index and customizing tokens but didn't find an obvious way. Thanks!

Hey @densefog, would the following work?

alias Haystack.{Query, Tokenizer, Transformer}

# I'm assuming you already have a %Haystack{} struct and the query string
# is the `q` variable

Haystack.index(haystack, :books, fn index ->
  tokens = Tokenizer.tokenize(q)
  tokens = Transformer.pipeline(tokens, Transformer.default())
  
  expressions = [
    Query.Expression.new(:match, field: "category", term: "elixir"),
    Enum.map(tokens, &Query.Expression.new(:match, field: "title", term: &1))
  ]
  
  Query.new()
  |> Query.clause(Query.Clause.expressions(Query.Clause.new(:all), expressions))
  |> Query.run(index)
end)

You can build your own %Query{} and pass it directly to Query.run/2.
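To make the intent of the :all clause concrete, here is a minimal plain-Elixir sketch of the semantics being asked for: a book matches only if its category is exact and every query term appears in its title. This is not how Haystack evaluates a query internally; the module name BookFilterSketch and the sample books are hypothetical and exist only for illustration.

```elixir
# Hypothetical module and data, not part of Haystack.
defmodule BookFilterSketch do
  @doc "Returns books whose category equals `category` and whose title contains every term."
  def search(books, category, terms) do
    category = String.downcase(category)
    terms = Enum.map(terms, &String.downcase/1)

    Enum.filter(books, fn book ->
      # Split the title into lowercase words so terms are compared word by word.
      title_words =
        book.title
        |> String.downcase()
        |> String.split(~r/\W+/u, trim: true)

      # "All" semantics: the category AND every term must match.
      String.downcase(book.category) == category and
        Enum.all?(terms, &(&1 in title_words))
    end)
  end
end

books = [
  %{id: 1, title: "Programming Ecto", category: "Elixir"},
  %{id: 2, title: "Programming Phoenix", category: "Elixir"},
  %{id: 3, title: "Ecto for Rubyists", category: "Ruby"}
]

results = BookFilterSketch.search(books, "elixir", ["ecto"])
IO.inspect(Enum.map(results, & &1.id))
```

Only the first book satisfies both conditions; the second fails on the title term and the third on the category.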

Hi @philipbrown, I think that's definitely on the right track. It gave me an error with the nested expressions list, so I put all the expressions on the same level. I'm not sure that's right, though.

I was attempting to use the data from the live-view example with something like this but must have something off since nothing is returned. I'll see if I can dig further into it.

alias Haystack.{Query, Tokenizer, Transformer}

Haystack.index(App.Articles.Search.haystack(), :articles, fn index ->
  tokens = Tokenizer.tokenize("giant")
  tokens = Transformer.pipeline(tokens, Transformer.default())
  
  expressions = [
    Query.Expression.new(:match, field: "name", term: "panda")
  ] ++ Enum.map(tokens, &Query.Expression.new(:match, field: "body", term: &1))

  Query.new()
  |> Query.clause(Query.Clause.expressions(Query.Clause.new(:all), expressions))
  |> Query.run(index)
end)

I also modified app/articles/search.ex for the load function to use name instead of title:

|> Stream.map(&Map.take(&1, ~w{id name body}a))
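The nesting problem mentioned above is a plain Elixir list issue and can be reproduced without Haystack. The sketch below uses placeholder atoms and tuples in place of real expressions (these placeholders are assumptions for illustration only) to show why writing Enum.map/2 as a list element produces a list inside a list, and three ways to end up with a flat list instead.

```elixir
# Placeholders standing in for query expressions; not Haystack structs.
per_token = Enum.map(["giant", "panda"], &{:title_expr, &1})

# Putting the Enum.map/2 result inside the literal nests it:
# the outer list has 2 elements, and the second one is itself a list.
nested = [:category_expr, per_token]

# Three equivalent ways to get a single flat list of expressions.
concatenated = [:category_expr] ++ per_token
prepended = [:category_expr | per_token]
flattened = List.flatten(nested)

IO.inspect(length(nested))
IO.inspect(concatenated)
IO.inspect(concatenated == prepended and prepended == flattened)
```

Using ++ as in the snippet above, or the [head | tail] form, avoids the extra pass that List.flatten/1 needs.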

Ahh, yes, you're right. Sorry, that was my mistake. The perils of writing code without testing it!

It should be:

alias Haystack.{Query, Tokenizer, Transformer}

Haystack.index(App.Articles.Search.haystack(), :articles, fn index ->
  tokens = Tokenizer.tokenize("giant")
  tokens = Transformer.pipeline(tokens, Transformer.default())
  
  expressions = [
    Query.Expression.new(:match, field: "name", term: "panda")
  ] ++ Enum.map(tokens, &Query.Expression.new(:match, field: "body", term: &1.v))

  Query.new()
  |> Query.clause(Query.Clause.expressions(Query.Clause.new(:all), expressions))
  |> Query.run(index)
end)

The only change is in the Enum.map call, where &1 becomes &1.v:

-] ++ Enum.map(tokens, &Query.Expression.new(:match, field: "body", term: &1))
+] ++ Enum.map(tokens, &Query.Expression.new(:match, field: "body", term: &1.v))

Also, good spot on the title -> name error. I've fixed that here: elixir-haystack/phoenix-live-view-example@36e1af4

Let me know how you get on!

Yes, that does appear to be working better. In my testing, though, I did run into some results that I thought were random, but it appears to be something with the Stemmer timing. For example, I was searching for the word "company" and it found no results where I expected some. When I changed the search to the stemmed word "compani", results were found. I'm sure that's a completely separate issue from this one. Thanks!

@densefog That's great! 🙌 Yeah, that sounds like a separate issue. Feel free to open a new issue with your findings, or any suggestions you have for improvement!
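For anyone hitting the same "company" versus "compani" behaviour: one possible explanation, which this thread does not confirm, is that the indexed text is stemmed while a term handed directly to an expression is not. The sketch below illustrates that mismatch in plain Elixir. StemSketch, its single toy rule (a trailing "y" becomes "i") and the sample documents are all hypothetical; this is not Haystack's stemmer or index.

```elixir
# Hypothetical module; the stem/1 rule is a toy, not a real stemming algorithm.
defmodule StemSketch do
  # Lowercases the word and rewrites a trailing "y" to "i" ("company" -> "compani").
  def stem(word) do
    word = String.downcase(word)

    if String.ends_with?(word, "y") do
      String.replace_suffix(word, "y", "i")
    else
      word
    end
  end

  # Builds a map of stemmed term => set of document ids.
  def build_index(docs) do
    for {id, text} <- docs,
        word <- String.split(text, ~r/\W+/u, trim: true),
        reduce: %{} do
      acc -> Map.update(acc, stem(word), MapSet.new([id]), &MapSet.put(&1, id))
    end
  end

  # Looks a term up exactly as given, with no transformation.
  def lookup(index, term) do
    index |> Map.get(term, MapSet.new()) |> MapSet.to_list()
  end
end

docs = [{1, "A small company"}, {2, "Giant panda facts"}]
index = StemSketch.build_index(docs)

# The raw term misses because the index only holds the stemmed form.
IO.inspect(StemSketch.lookup(index, "company"))

# Stemming the query term the same way as the indexed text finds the document.
IO.inspect(StemSketch.lookup(index, StemSketch.stem("company")))
```

If this is what is happening, running every query term through the same transformer pipeline used at index time, including any hard-coded term, would be the thing to check first.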