Keyword detection: the number of keywords is currently hardcoded to 3 words
Closed this issue · 1 comments
qmeeus commented
2 options are provided: `ratio` (value between 0 and 1) and `words` (integer).
- If `words` is too large, we get an `IndexError`
- If `words` is too small, we cannot get relevant results (because keywords can be n-grams, but words are always unigrams)
What we can do:
- Use `ratio` = 1.0 and trim the returned list to a length that suits us (`ratio` is the proportion of the found keywords to return)
How do we know how many keywords to keep if we don't know the text beforehand?
- We can use the score that is returned alongside the keywords
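A minimal sketch of the score-based trimming idea, not tied to any particular library. It assumes the extractor, called with `ratio` = 1.0, yields `(keyword, score)` pairs with non-negative scores where a higher score means a more relevant keyword. The function name `select_keywords` and the parameters `min_relative_score` and `max_keywords` are hypothetical and only here for illustration.

```python
from typing import List, Tuple

# Assumption: each result is a (keyword, score) pair, higher score = more relevant.
ScoredKeyword = Tuple[str, float]


def select_keywords(
    scored: List[ScoredKeyword],
    min_relative_score: float = 0.5,  # hypothetical default, tune per corpus
    max_keywords: int = 10,           # hypothetical upper bound on the result size
) -> List[str]:
    """Keep keywords scoring at least `min_relative_score` times the best score."""
    if not scored:
        return []
    # Sort explicitly so we do not depend on the extractor's ordering.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    cutoff = ranked[0][1] * min_relative_score
    kept = [keyword for keyword, score in ranked if score >= cutoff]
    # Slicing never raises IndexError, even if fewer keywords were found.
    return kept[:max_keywords]


if __name__ == "__main__":
    # Made-up scores, purely for illustration.
    example = [("neural network", 0.42), ("training", 0.30), ("data", 0.12)]
    print(select_keywords(example))
```

Because the cutoff is relative to the best score, the number of keywords kept adapts to each text instead of being fixed in advance, and n-gram keywords are kept or dropped whole.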