Classifier does not work when text contains "constructor" as a token.
The problem is this line: https://github.com/ttezel/bayes/blob/master/lib/naive_bayes.js#L248
Naivebayes.prototype.frequencyTable = function (tokens) {
  var frequencyTable = {}
  tokens.forEach(function (token) {
    if (!frequencyTable[token])
      frequencyTable[token] = 1
    else
      frequencyTable[token]++
  })
  return frequencyTable
}
When token is "constructor", frequencyTable[token] is always truthy, because every plain object in JavaScript inherits the constructor property from Object.prototype. Therefore frequencyTable[token]++ runs on the inherited function instead of a count, and this results in NaN.
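A minimal reproduction of the behaviour described above, as a standalone sketch. The function name buggyFrequencyTable is made up for illustration; the body is copied from the snippet quoted in this issue.

```javascript
// Same logic as the quoted frequencyTable method, extracted into a plain function.
function buggyFrequencyTable(tokens) {
  var frequencyTable = {}
  tokens.forEach(function (token) {
    // For "constructor" this lookup finds the inherited Object function, which is truthy.
    if (!frequencyTable[token])
      frequencyTable[token] = 1
    else
      frequencyTable[token]++ // incrementing a function yields NaN
  })
  return frequencyTable
}

var table = buggyFrequencyTable(['hello', 'constructor'])
console.log(typeof {}.constructor)           // "function"
console.log(Number.isNaN(table.constructor)) // true
console.log(table.hello)                     // 1
```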
To fix this, we need to check for if (!frequencyTable.hasOwnProperty(token)). We will overwrite the constructor property, but we do not need it for the object anyway.
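A sketch of what the own-property check could look like. This is not necessarily the patch that ended up in the library; safeFrequencyTable is a hypothetical name, and calling the check through Object.prototype is my choice so that it keeps working even if a token is literally "hasOwnProperty".

```javascript
// Hypothetical standalone version of frequencyTable with an own-property check.
function safeFrequencyTable(tokens) {
  var frequencyTable = {}
  tokens.forEach(function (token) {
    // Only own properties count; inherited ones such as "constructor" are ignored.
    if (!Object.prototype.hasOwnProperty.call(frequencyTable, token))
      frequencyTable[token] = 1
    else
      frequencyTable[token]++
  })
  return frequencyTable
}

var counts = safeFrequencyTable(['constructor', 'hello', 'constructor'])
console.log(counts.constructor) // 2
console.log(counts.hello)       // 1
```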
You can also do frequencyTable = Object.create(null) instead, which should be faster. Also, it is cleaner than overwriting frequencyTable.constructor.
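For comparison, a sketch of the Object.create(null) variant. The name nullProtoFrequencyTable is hypothetical, and I have not measured whether it is actually faster.

```javascript
// Hypothetical standalone version of frequencyTable backed by a prototype-less object.
function nullProtoFrequencyTable(tokens) {
  // No prototype, so there are no inherited keys for a token to collide with.
  var frequencyTable = Object.create(null)
  tokens.forEach(function (token) {
    if (!frequencyTable[token])
      frequencyTable[token] = 1
    else
      frequencyTable[token]++
  })
  return frequencyTable
}

var nullProtoCounts = nullProtoFrequencyTable(['constructor', 'toString', 'constructor'])
console.log(nullProtoCounts.constructor) // 2
console.log(nullProtoCounts.toString)    // 1
```

Note that an object created this way has no hasOwnProperty method of its own, so any own-property check on it has to go through Object.prototype.hasOwnProperty.call(frequencyTable, token).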
Yes, you are right. However, I think we should do both.
This has been fixed in bayes v0.0.5. Run npm update bayes to get it! Thanks.
You need to apply this also to this.vocabulary, this.docCount, this.wordCount, this.wordFrequencyCount and this.wordFrequencyCount[categoryName] to be safe.
Use a Map instead, since ES6 is a thing now.
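A rough sketch of the Map idea, assuming an ES6-capable runtime. mapFrequencyTable is a hypothetical name, not something the library exposes.

```javascript
// Hypothetical Map-based frequency table; keys never touch Object.prototype.
function mapFrequencyTable(tokens) {
  const frequencyTable = new Map()
  for (const token of tokens) {
    // get() returns undefined for unseen tokens, so fall back to 0.
    frequencyTable.set(token, (frequencyTable.get(token) || 0) + 1)
  }
  return frequencyTable
}

const mapCounts = mapFrequencyTable(['constructor', '__proto__', 'constructor'])
console.log(mapCounts.get('constructor')) // 2
console.log(mapCounts.get('__proto__'))   // 1
```

One trade-off: JSON.stringify on a Map produces "{}", so any code that serializes these tables to JSON would need to convert them first, for example with Array.from(frequencyTable).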