Conducting searches with quotes yields results that don't include transcript snippets
caseyedavis12 opened this issue · 5 comments
Describe the bug
When a user conducts a search with quotes around the search term, retrieved results do not show where those keywords appear in transcripts. If you do the same search without quotes, search results show transcript snippets.
Expected behavior
Searches for terms in quotes should include results that show transcript snippets.
Looking into this:
When searching for "ice cream":
https://americanarchive.org/catalog?f%5Bspecial_collections%5D%5B%5D=backstory&per_page=10&q=%22ice+cream%22&utf8=%E2%9C%93&f[access_types][]=online
GET /snippets.json?
The request form data is sent as follows:
Form Data
ids[]: cpb-aacip-532-3775t3h512
ids[]: cpb-aacip-5a4ab819222
ids[]: cpb-aacip-532-t43hx1758w
ids[]: cpb-aacip-af60440fe9c
ids[]: cpb-aacip-cca5af185c1
ids[]: cpb-aacip-532-c53dz0493w
ids[]: cpb-aacip-532-h41jh3fc4q
ids[]: cpb-aacip-532-n29p26rf2k
ids[]: cpb-aacip-532-jq0sq8rs78
ids[]: cpb-aacip-41541e17829
query: ""
So the quotes are not properly sent from the search form.
JS / ERB
AAPB2/app/views/catalog/index.html.erb
Lines 45 to 50 in 8aebfd3
The double quotes around @query break when the ERB template is rendered.
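To illustrate the kind of breakage described here, the following is a minimal sketch using Ruby's standard-library ERB rather than the actual view. The template strings are hypothetical, since lines 45 to 50 of index.html.erb are not reproduced in this issue; the assumption is that the query is interpolated inside a double-quoted JavaScript string. Rails views also HTML-escape `<%= %>` output, which plain ERB does not, so the real template may behave differently.

```ruby
require 'erb'
require 'json'

# Hypothetical template lines (assumption): the query is dropped into a
# double-quoted JavaScript string literal.
NAIVE_TEMPLATE = ERB.new('var query = "<%= query %>";')

# Alternative: let to_json emit the quotes and escape any inner ones.
JSON_TEMPLATE = ERB.new('var query = <%= query.to_json %>;')

# Render a template with a local variable named `query`.
def render(template, query)
  template.result_with_hash(query: query)
end

naive_js = render(NAIVE_TEMPLATE, '"ice cream"')
json_js  = render(JSON_TEMPLATE, '"ice cream"')
empty_js = render(NAIVE_TEMPLATE, '""')

puts naive_js # the inner quotes end the JavaScript string early
puts json_js  # the inner quotes are backslash-escaped
puts empty_js # a query that is already "" renders as four quote characters
```

If @query has already been reduced to "" before the view renders, the naive line would produce the four-quote value listed under Tests below, which suggests the template may not be the only place to look.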
Tests
ice cream => "ice cream"
"ice cream" => """"
"ice cream => 500 error
'ice cream => "'ice cream"
'ice cream' => "'ice cream'"
Escaping
Tried many combinations, including:
- Escaping with single quotes (')
- Escaping with backticks (`)
- Unescaping
- Using raw()
- Using html_escape()
- Multiple of the above
- Several others I've forgotten
So far, none yield the expected results.
catalog_controller
AAPB2/app/controllers/catalog_controller.rb
Lines 195 to 197 in 8aebfd3
With pry:
> params
=> {"utf8"=>"✓", "f"=>{"access_types"=>["online"]}, "per_page"=>"100", "q"=>"\"ice cream\"", "controller"=>"catalog", "action"=>"index"}
> params[:q]
=> "\"ice cream\""
> params[:q].dup
=> "\"ice cream\""
> @query
=> "\"\""
> @terms_array
=> [["ICE", "CREAM"]]
Questions
- How is @query empty if params[:q].dup => "\"ice cream\""?
- How did @terms_array get [["ICE", "CREAM"]] from that?
Some answers
1+2. @query is "\"ice cream\"" before line 197, and "\"\"" afterwards.
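One way a value can change like this is a helper that mutates the string it is given. The sketch below is an assumption, not the code at line 197: extract_quoted_terms! is a hypothetical helper that removes quoted phrases from its argument in place with String#slice!. It reproduces the same before and after values seen in pry, and shows that passing a copy leaves the caller's string intact.

```ruby
# Hypothetical helper (assumption): pulls quoted phrases out of the query
# and, as a side effect, deletes them from the string that was passed in.
def extract_quoted_terms!(query)
  phrases = query.scan(/"([^"]*)"/).flatten
  phrases.each { |phrase| query.slice!(phrase) } # mutates the caller's string
  phrases.map { |phrase| phrase.upcase.split }
end

q_param = '"ice cream"'

# Passing the string directly: the helper empties it down to the bare quotes.
query = q_param.dup
terms = extract_quoted_terms!(query)

# Passing a copy: the original survives for later use, e.g. in the view.
safe_query = q_param.dup
safe_terms = extract_quoted_terms!(safe_query.dup)

p query      # the quoted phrase has been removed
p terms
p safe_query # unchanged
```

If something similar is happening in the real helper, duplicating the string at the call site or switching the helper to non-mutating methods would keep @query intact.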
More questions
- ??
query_to_terms_array
After pairing with Drew, we decided this function is confusing.
AAPB2/app/helpers/application_helper.rb
Lines 13 to 41 in 8aebfd3
Current steps:
- Read the list of stopwords
- If query contains quotes:
  - Get quoted terms
  - Remove quoted terms from query
  - Remove punctuation from remaining query
  - Uppercase remaining query
  - Remove stop words from remaining query
  - Flatten quoted terms
  - Add remaining cleaned query terms to quoted terms
- Else:
  - Split query by space
  - Remove stopwords
- For each term:
  - Uppercase
  - Strip whitespace
  - Remove all non word, non space characters
- Return query terms array
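The steps above could be collapsed into a single non-mutating pass. This is only a sketch of one possible simplification, not the existing query_to_terms_array: the names terms_from_query, clean_term, and STOPWORDS are hypothetical, the stopword list is a stand-in for the real one, and the assumption that unquoted words come back as one-element arrays is a guess based on the [["ICE", "CREAM"]] shape seen above.

```ruby
# Stand-in stopword list (assumption); the real helper reads its own list.
STOPWORDS = %w[A AN AND OF THE].freeze

# Uppercase, drop everything except word characters and whitespace, trim.
def clean_term(term)
  term.upcase.gsub(/[^\w\s]/, '').strip
end

# Returns an array of terms, each term an array of uppercase words.
# Never modifies `query`, so the caller can still use the original string.
def terms_from_query(query, stopwords: STOPWORDS)
  quoted = query.scan(/"([^"]+)"/).flatten
  remaining = query.gsub(/"[^"]*"/, ' ') # gsub returns a new string

  phrase_terms = quoted.map { |phrase| clean_term(phrase).split }
  word_terms = remaining.split
                        .map { |word| clean_term(word) }
                        .reject { |word| word.empty? || stopwords.include?(word) }
                        .map { |word| [word] }

  (phrase_terms + word_terms).reject(&:empty?)
end

original = '"ice cream"'.freeze # frozen to show nothing mutates it
p terms_from_query(original)
p terms_from_query('the "ice cream" shop')
p terms_from_query('"ice cream') # an unbalanced quote is treated as plain words
```

Treating quoted and unquoted input through the same path removes the if/else branch, and because only scan and gsub are used, a frozen string can be passed safely.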