KL-Divergence Implementation does not handle 0 probabilities
carlosparadis opened this issue · 3 comments
carlosparadis commented
Calling createJSON throws the following error:
Error in stats::cmdscale(dist.mat, k = 2) : NA values not allowed in 'd'
I traced it down to:
Lines 298 to 304 in 51bb51e
To reproduce the issue:
Reproducible dataset
x <- c(0.2,0.3,0.3)
y <- c(0.2,0.3,0.4)
b <- c(0.2,0.3,0)
Using the LDAvis implementation shown at the start of this issue:
> jensenShannon(x=x,y=y)
[1] 0.003583677
> jensenShannon(x=x,y=b)
[1] NaN
The same test, using the cosine function from the lsa package:
> cosine(x=x,y=y)
          [,1]
[1,] 0.9897595
> cosine(x=x,y=b)
          [,1]
[1,] 0.7687061
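The NaN appears to come from the zero entry in b: a term of the form 0 * log(0 / m) evaluates to 0 * -Inf in R, which is NaN, and sum() then propagates it. A minimal sketch of a zero-safe variant follows, assuming the usual convention that 0 * log(0) counts as 0. The name jensenShannonSafe is hypothetical and not part of LDAvis, and the formula is inferred from the outputs above rather than copied from the package source.

```r
# Hypothetical helper (not part of LDAvis): Jensen-Shannon divergence that
# treats 0 * log(0) as 0, the usual convention for KL-type sums.
jensenShannonSafe <- function(x, y) {
  m <- 0.5 * (x + y)
  # Terms where x (or y) is 0 are defined as 0; everywhere else m > 0,
  # so the logarithm is finite.
  lhs <- ifelse(x == 0, 0, x * log(x / m))
  rhs <- ifelse(y == 0, 0, y * log(y / m))
  0.5 * sum(lhs) + 0.5 * sum(rhs)
}

# Same vectors as in the reproduction above.
x <- c(0.2, 0.3, 0.3)
y <- c(0.2, 0.3, 0.4)
b <- c(0.2, 0.3, 0)

jensenShannonSafe(x, y)  # no zeros involved, so this should agree with jensenShannon(x, y)
jensenShannonSafe(x, b)  # finite instead of NaN
```

With a finite value for every pair, the distance matrix passed to stats::cmdscale should no longer contain NA entries, assuming zero probabilities are the only source of NaN.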
rnkazman commented
This seems like an implementation detail, not a principled reason to use one or the other. Is that correct?
caitsimop commented
Hi there, I'm still getting this error in v0.3.5. Is this the most up-to-date version?