cdipaolo/goml

Why do Predict and Probability functions use different operators?


I understand why the Naive Bayes "Predict" function uses math.Log() to avoid underflow. What I don't understand is why the operator on lines 288 and 293 is += instead of *=. Could you provide an explanation? Maybe an update to the docs?

Hi Lane,

Thanks for the important question:

In the model we would like to compute the product p1 * p2 * ... * pn of a bunch of probabilities. As you know, this has serious underflow problems, so instead we compute log(p1 * p2 * ... * pn). By the basic log property log(a*b) = log(a) + log(b), we have log(p1 * p2 * ... * pn) = log(p1) + log(p2) + ... + log(pn), which is exactly what we do: each += adds one log(pi) to the running sum. Because log is monotonically increasing, the class with the largest sum of logs is also the class with the largest product, so the prediction is unchanged.

Does that help?