Phrase search considering synonyms
Would it be possible to use this matching library for a smarter phrase search that takes spaCy's word vectors into account?
For instance, if I create a matcher object like this:
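import spacy
from spaczz.matcher import FuzzyMatcher

# Load a model that ships with word vectors.
nlp = spacy.load('en_core_web_lg')
matcher = FuzzyMatcher(nlp.vocab)
# Register the phrase to search for under a label.
matcher.add("my_phrase", [nlp('humorous story')])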
It would then be interesting to also get a match for a sentence like the one in this example:
matcher(nlp('He told me a very funny story.'))
Here the sub-phrase "funny story" is a synonym of the phrase "humorous story" that was added to the matcher.
Hi @Matt52, this is an interesting idea. I think it could be implemented much like the current fuzzy matcher, but using spaCy's existing Span.similarity method instead of fuzzy matching. I don't have a definite timeline for getting this implemented, but I will start working on it as time allows.
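To make the idea concrete, here is a minimal sketch of a similarity-based phrase search. It is not spaczz's implementation. It slides a query-sized window over the tokens and scores each window by the cosine similarity of averaged word vectors, which is the default estimate spaCy documents for Span.similarity. The names `TOY_VECTORS`, `mean_vector`, `cosine`, and `similarity_search` are hypothetical, and the three-dimensional vectors are invented so the example runs without a spaCy model.

```python
import math

# Toy 3-d word vectors, invented for illustration only. A real pipeline
# would take these from a spaCy model with vectors (e.g. en_core_web_lg).
TOY_VECTORS = {
    "humorous": (0.9, 0.1, 0.0),
    "funny": (0.8, 0.2, 0.0),
    "story": (0.1, 0.9, 0.0),
    "very": (0.0, 0.1, 0.9),
    "told": (0.1, 0.0, 0.8),
}
DIMS = 3


def mean_vector(tokens):
    """Average the token vectors; unknown tokens count as zero vectors."""
    total = [0.0] * DIMS
    for token in tokens:
        vector = TOY_VECTORS.get(token.lower(), (0.0,) * DIMS)
        for i in range(DIMS):
            total[i] += vector[i]
    return [value / len(tokens) for value in total]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def similarity_search(query_tokens, doc_tokens, min_similarity=0.9):
    """Return (start, end, score) for each query-sized window that
    scores at or above min_similarity against the query."""
    size = len(query_tokens)
    query_vector = mean_vector(query_tokens)
    matches = []
    for start in range(len(doc_tokens) - size + 1):
        window = doc_tokens[start:start + size]
        score = cosine(query_vector, mean_vector(window))
        if score >= min_similarity:
            matches.append((start, start + size, score))
    return matches


doc_tokens = ["He", "told", "me", "a", "very", "funny", "story", "."]
matches = similarity_search(["humorous", "story"], doc_tokens)
for start, end, score in matches:
    print(doc_tokens[start:end], round(score, 3))
```

With these toy vectors only the window covering "funny story" clears the 0.9 threshold. A fuller matcher would probably also try windows shorter or longer than the query, since a synonym phrase need not have the same number of tokens.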
Hi @Matt52, sorry for the really slow development on my end. Between work and the stress of the recent election cycle here in the US, I have not been very productive. I have picked this issue back up and have about half the work done in the new branch enhancement-similaritysearch. I'm hoping to have a release ready in the next couple of days.
There will likely be more changes as I continue working on enhancing spaczz in general but the initial enhancement should be enough to get you started.
Hi @Matt52, I am publishing a new release (v0.3.1) shortly and will close this issue accordingly. Please check out the new details about the SimilarityMatcher in the readme once the release is out, and let me know if you have any additional questions or issues. Thanks!