proycon/colibri-core

Wrong threshold in model.filter

svetlana21 opened this issue · 3 comments

Hello!
I set mintokens=50 in options = colibricore.PatternModelOptions(mintokens=50, maxlength=6, doskipgrams=True), and then tried to extract skipgrams with self.model.filter(0, colibricore.Category.SKIPGRAM).
The results look as if the threshold were 100 (I don't see any skipgram with fewer than 100 occurrences). Is this a bug, or am I doing something wrong?

When I set mintokens=25, I get skipgrams with 50 occurrences.

Your understanding is right: you should indeed get all skipgrams that occur 50 times or more, provided that you use the same threshold at training time (and not a higher one). Might that be the case, perhaps?
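
To illustrate the point about training-time versus filter-time thresholds, here is a minimal pure-Python sketch. It does not use colibricore and is not how the library is implemented; the functions train and filter_model and the sample counts are hypothetical stand-ins. It only shows that a pattern pruned during training cannot be recovered by a lower filter threshold later.

```python
# Hypothetical stand-in for a pattern model: a dict of pattern -> count.
# None of these names come from colibricore; they are assumptions for illustration.

def train(counts, mintokens):
    """Keep only patterns seen at least `mintokens` times (training-time pruning)."""
    return {p: c for p, c in counts.items() if c >= mintokens}

def filter_model(model, threshold):
    """Return (pattern, count) pairs with count >= threshold, sorted by pattern."""
    return [(p, c) for p, c in sorted(model.items()) if c >= threshold]

# Made-up skipgram counts, purely for the example.
counts = {"a {*} b": 120, "c {*} d": 60, "e {*} f": 30}

# Trained with mintokens=50: filtering with threshold 0 still yields only
# patterns that occur 50 times or more.
model = train(counts, mintokens=50)
result = filter_model(model, 0)

# Trained with a higher threshold (100): the pattern with count 60 is gone,
# no matter how low the filter threshold is.
strict_result = filter_model(train(counts, mintokens=100), 0)
```

In this sketch the effective threshold is the larger of the training threshold and the filter threshold, which is why the threshold used at training time matters.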

(closing this issue due to no activity in a long time, please reopen if still relevant)