dmeoli/WS4J

Error in similarity calculation

jairofsouza opened this issue · 2 comments

Hi @DonatoMeoli
Thank you for your code!
I'm using the class SimilarityCalculationDemo and I noticed that it returns 0 for every similarity algorithm on some word pairs. For example, I've tested it with (car, vehicle) and (cancer, disease).

I guess the problem is here:
https://github.com/DonatoMeoli/WS4J/blob/008a427123a9f25106ea81495eb256976658dd1f/src/main/java/edu/uniba/di/lacam/kdde/ws4j/util/WordSimilarityCalculator.java#L110

I think the following code needs to be added there:
double score = relatedness.getScore();
if (score > maxScore) maxScore = score;

Do you think it is correct?
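
To make the suggestion concrete, here is a minimal, self-contained sketch of the idea. It is not the actual WS4J code: the class `MaxScoreSketch`, the helper `maxScore`, the sense labels, and the toy scores are all hypothetical, and the scorer lambda only stands in for whatever `relatedness.getScore()` would return. It also assumes that scores are non-negative, so starting from 0 is safe.

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

public class MaxScoreSketch {

    // Hypothetical helper: best score over all pairs of senses of two words.
    // Assumes scores are non-negative (an assumption of this sketch).
    static double maxScore(List<String> senses1, List<String> senses2,
                           ToDoubleBiFunction<String, String> scorer) {
        double maxScore = 0.0;
        for (String s1 : senses1) {
            for (String s2 : senses2) {
                // Stand-in for relatedness.getScore() in the real code.
                double score = scorer.applyAsDouble(s1, s2);
                // The proposed fix: keep the best score seen so far.
                if (score > maxScore) maxScore = score;
            }
        }
        return maxScore;
    }

    public static void main(String[] args) {
        // Made-up sense labels and scores, for illustration only.
        List<String> car = List.of("car#n#1", "car#n#2");
        List<String> vehicle = List.of("vehicle#n#1");
        ToDoubleBiFunction<String, String> toyScorer =
                (s1, s2) -> s1.equals("car#n#1") ? 0.9 : 0.1;

        System.out.println(maxScore(car, vehicle, toyScorer)); // prints 0.9
    }
}
```

Without the update inside the loop, `maxScore` would never change from its initial value, which would explain a result of 0 for every pair.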

Also, the WuPalmer metric produces an error with (cancer, disease). The problem is at https://github.com/DonatoMeoli/WS4J/blob/008a427123a9f25106ea81495eb256976658dd1f/src/main/java/edu/uniba/di/lacam/kdde/ws4j/util/WordSimilarityCalculator.java#L123

It's not clear to me why you are checking maxScore there. Would you mind explaining it to me, please? :-)

Hi @jairofsouza,

Thanks for opening this issue. Several years have passed since I wrote this code and, honestly, I don't remember exactly why I wrote it this way.
Anyway, can you open a PR and make sure that the unit tests and the tests cross-referenced against Python NLTK pass?

Thanks for the contribution.