Discarding keywords/keyphrases with certain POS tags
Dear authors,
Firstly, I want to thank you for the great work you're doing!
I wonder what the best practice would be to detect and discard keyphrases that do (or do not) have a specific POS tag when using YAKE.
More specifically, I need to discard names of people and numbers for a project. I could do that after YAKE extracted them from my corpus, but I assume it would be more efficient to not even build/include the key phrases when they're of a specific POS tag.
Thanks in advance for any hints/ideas!
Dear Lisa,
Sorry for my late reply. Busy days.
Thanks for your kind words and for using YAKE!
If I understood your question correctly, you are interested in discarding names of people, even if YAKE detects them as keyphrases. While you could do this directly in YAKE (by creating a branch and adapting its code), it probably makes more sense to add a post-processing layer where a POS filter (for example, one built on spaCy) can be applied.
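A minimal sketch of such a post-processing layer could look like the following. It is only an illustration under several assumptions, not part of YAKE: the helper names `filter_keyphrases` and `extract_filtered` are hypothetical, the spaCy model `en_core_web_sm` is assumed to be installed, and `extract_keywords` is assumed to return `(keyphrase, score)` pairs as in recent YAKE releases. Tagging each phrase in isolation is a simplification, since spaCy sees no surrounding context and may mislabel some names.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Assumed defaults for this sketch: drop person names and numbers.
BLOCKED_ENTITY_TYPES = {"PERSON"}  # spaCy named-entity label
BLOCKED_POS_TAGS = {"NUM"}         # Universal POS tag for numerals


def filter_keyphrases(
    keyphrases: Iterable[Tuple[str, float]],
    nlp: Callable,
    blocked_pos: Set[str] = BLOCKED_POS_TAGS,
    blocked_ents: Set[str] = BLOCKED_ENTITY_TYPES,
) -> List[Tuple[str, float]]:
    """Keep (phrase, score) pairs in which no token has a blocked
    POS tag or a blocked entity type."""
    kept = []
    for phrase, score in keyphrases:
        doc = nlp(phrase)  # phrase is tagged on its own, without context
        if any(
            tok.pos_ in blocked_pos or tok.ent_type_ in blocked_ents
            for tok in doc
        ):
            continue
        kept.append((phrase, score))
    return kept


def extract_filtered(text: str, top: int = 20) -> List[Tuple[str, float]]:
    """Run YAKE, then apply the spaCy-based filter to its output."""
    # Imported lazily so the filter above can be used without these packages.
    import spacy
    import yake

    nlp = spacy.load("en_core_web_sm")  # assumed model; any tagger+NER pipeline works
    extractor = yake.KeywordExtractor(lan="en", n=3, top=top)
    return filter_keyphrases(extractor.extract_keywords(text), nlp)
```

Calling `extract_filtered(my_text)` would then return YAKE's keyphrases minus those containing a person name or a number. The model can be fetched with `python -m spacy download en_core_web_sm`, and the two blocked sets can be adjusted (for instance, adding `"PROPN"`) depending on what should be discarded.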
Best,
Ricardo
Dear Ricardo,
many thanks for your reply and suggestions. I indeed went for the latter approach!
Best,
Lisa