Prior probability includes word frequencies?
muety commented
This is more of a question than an actual issue, but anyway.
First, did I get it right that the prior probability `P(C_j)` of a class is the number of documents within that class, divided by the total number of documents?
And if so, why does the `getPriors()` function set the prior probability of a class C to the number of words in documents of that class (`classData.Total`) divided by the total number of words? I'd expect that words don't play any role yet when computing the prior.
I am probably misunderstanding something, so please enlighten me.
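
To make the question concrete, here is a minimal Python sketch of the two calculations I am comparing. It is not the library's code: the toy corpus and the function names `document_priors` and `word_priors` are made up for illustration, and `word_priors` only reflects my reading of what `getPriors()` does.

```python
# Hypothetical toy corpus: class name -> list of documents,
# where each document is a list of words.
corpus = {
    "spam": [["buy", "cheap", "pills", "now", "buy", "now"]],
    "ham": [["hello"], ["thanks"], ["see", "you"]],
}


def document_priors(corpus):
    """Prior as I expected it: documents in class / total documents."""
    total_docs = sum(len(docs) for docs in corpus.values())
    return {cls: len(docs) / total_docs for cls, docs in corpus.items()}


def word_priors(corpus):
    """Prior as I read getPriors(): words in class / total words."""
    word_counts = {
        cls: sum(len(doc) for doc in docs) for cls, docs in corpus.items()
    }
    total_words = sum(word_counts.values())
    return {cls: count / total_words for cls, count in word_counts.items()}


if __name__ == "__main__":
    print("document-based:", document_priors(corpus))
    print("word-based:    ", word_priors(corpus))
```

With this toy data, "spam" has one long document and "ham" has three short ones, so the document-based prior for "spam" is 0.25 while the word-based prior is 0.6. The two definitions only agree when documents in every class have the same average length.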