ajb2969/InformationRetrieval

Fix punctuation in index

Closed this issue · 3 comments

Remove - and " from the index, etc.

Sort of related: should queries be stripped of punctuation, since our index is punctuation-free?

The issue with removing punctuation from queries is that you end up with theres instead of there's, neither of which is contained in the index. If queries are to be stripped of punctuation, the stripping should be limited to a finite set of punctuation marks whose removal doesn't affect the query itself.
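For illustration, a rough sketch of what stripping only a limited set could look like. The class name `QueryNormalizer`, the exact characters in the set, and the choice to replace them with a space are all assumptions made for this example, not something taken from the repo:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project.
class QueryNormalizer {
    // Assumed limited set: hyphen, double quote, and common sentence punctuation.
    // The apostrophe is deliberately left out so "there's" stays intact.
    private static final Pattern LIMITED = Pattern.compile("[-\",.;:!?()]");

    static String normalize(String query) {
        // Replace each listed character with a space, then collapse runs of whitespace.
        String spaced = LIMITED.matcher(query).replaceAll(" ");
        return spaced.trim().replaceAll("\\s+", " ");
    }
}
```

With this set, `QueryNormalizer.normalize("there's a \"well-known\" issue")` gives `there's a well known issue`. Whether a hyphen should become a space or be dropped entirely would need to match however the index itself was built.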

This issue will be closed, since punctuation is removed from the index using the Java punctuation regex.
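For reference, a minimal sketch of that kind of stripping, assuming "the Java punctuation regex" means the POSIX character class `\p{Punct}` from `java.util.regex.Pattern` (the thread doesn't say which pattern was actually used). The class and method names are made up for the example:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not the project's actual indexing code.
class IndexPunctuation {
    // \p{Punct} matches ASCII punctuation: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}");

    static String strip(String text) {
        // Delete every punctuation character; letters, digits, and whitespace are untouched.
        return PUNCT.matcher(text).replaceAll("");
    }
}
```

Here `IndexPunctuation.strip("there's a \"well-known\" issue")` gives `theres a wellknown issue`, so under this assumption apostrophes and hyphens are dropped along with quotes.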