jbrukh/bayesian

Prior probability includes word frequencies?


muety commented

This is more of a question than an actual issue, but anyway.

First, did I get it right that the prior probability P(C_j) of a class is the number of documents within that class, divided by the total number of documents?

And if so, why does the getPriors() function set the prior probability of a class C to the number of words in documents of that class (classData.Total) divided by the total number of words across all classes? I'd expect that word counts don't play any role in the prior probability yet.
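To make the distinction concrete, here is a minimal sketch (not taken from the library; the corpus numbers are made up for illustration) contrasting the two ways of computing a class prior, document-based versus word-based:

```go
package main

import "fmt"

func main() {
	// Hypothetical corpus: class "spam" has 2 documents containing 30 words
	// in total; class "ham" has 8 documents containing 20 words in total.
	docCounts := map[string]float64{"spam": 2, "ham": 8}
	wordCounts := map[string]float64{"spam": 30, "ham": 20}

	totalDocs, totalWords := 0.0, 0.0
	for c := range docCounts {
		totalDocs += docCounts[c]
		totalWords += wordCounts[c]
	}

	// Document-based prior: P(C) = documents in C / total documents
	fmt.Printf("doc-based  P(spam) = %.2f\n", docCounts["spam"]/totalDocs)   // 0.20

	// Word-based prior (what getPriors appears to compute):
	// P(C) = words in documents of C / total words
	fmt.Printf("word-based P(spam) = %.2f\n", wordCounts["spam"]/totalWords) // 0.60
}
```

With these numbers the two definitions disagree (0.20 vs. 0.60), which is exactly why the choice in getPriors() matters: a class with few but long documents gets a larger word-based prior than its document frequency would suggest.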

Probably I have a problem in understanding, so please try to enlighten me.