Clustream WithKmeans: null center and (radius, weight = 0) for BOTH micro and macro clusters
onofricamila opened this issue · 0 comments
First of all: I am using moa-release-2019.05.0-bin/moa-release-2019.05.0/lib/moa.jar
(obtained from https://moa.cms.waikato.ac.nz/downloads/).
Now, let's get to the point: I am trying to use the moa.clusterers.clustream.WithKmeans stream clustering algorithm and I have no idea why this is happening ...
- My code:
import com.yahoo.labs.samoa.instances.DenseInstance;
import moa.cluster.Clustering;
import moa.clusterers.clustream.WithKmeans;

public class TestingClustream {

    static DenseInstance randomInstance(int size) {
        DenseInstance instance = new DenseInstance(size);
        for (int idx = 0; idx < size; idx++) {
            instance.setValue(idx, Math.random());
        }
        return instance;
    }

    public static void main(String[] args) {
        WithKmeans wkm = new WithKmeans();
        wkm.kOption.setValue(5);
        wkm.maxNumKernelsOption.setValue(300);
        wkm.resetLearningImpl();
        for (int i = 0; i < 10000; i++) {
            wkm.trainOnInstanceImpl(randomInstance(2));
        }
        Clustering clusteringResult = wkm.getClusteringResult();
        Clustering microClusteringResult = wkm.getMicroClusteringResult();
    }
}
- Info from the debugger:
I have read the source code many times, and it seems to me that I am using the correct functions in the correct order ... I do not know what I am missing ... Any feedback is welcome!
EDIT:
Thanks to Anony-Mousse on Stack Overflow, I noticed the fields are unused, likely coming from some parent class with a different purpose. Using the getter methods such as getCenter(), getWeight(), and getRadius(), I could get the values.
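To illustrate the pattern described above, here is a minimal sketch of how a debugger can show null/0 fields while the getters return real values. The classes BaseClusterSketch and KernelSketch are hypothetical stand-ins I made up for illustration; they are not MOA's actual source, and MOA's real classes may be organized differently.

```java
import java.util.Arrays;

// Hypothetical stand-in classes for illustration only, NOT MOA's source code.
class FieldVsGetterDemo {
    public static void main(String[] args) {
        KernelSketch k = new KernelSketch(2);
        k.insert(new double[] {1.0, 2.0});
        k.insert(new double[] {3.0, 4.0});

        // What a debugger would display: the inherited fields keep their defaults.
        System.out.println("fields : center=" + Arrays.toString(k.center)
                + " weight=" + k.weight);
        // What the getters report: values derived from the running sums.
        System.out.println("getters: center=" + Arrays.toString(k.getCenter())
                + " weight=" + k.getWeight());
    }
}

// Parent class that declares the fields a debugger shows.
class BaseClusterSketch {
    protected double[] center; // never assigned by the subclass below, stays null
    protected double weight;   // never assigned by the subclass below, stays 0.0

    public double[] getCenter() { return center; }
    public double getWeight() { return weight; }
}

// Subclass that keeps its own summary statistics and overrides the getters.
class KernelSketch extends BaseClusterSketch {
    private final double[] linearSum; // per-dimension sum of inserted points
    private double n;                 // number of inserted points

    KernelSketch(int dims) {
        linearSum = new double[dims];
    }

    void insert(double[] point) {
        n += 1;
        for (int i = 0; i < linearSum.length; i++) {
            linearSum[i] += point[i];
        }
    }

    @Override
    public double[] getCenter() {
        // Center is computed on demand as linearSum / n.
        double[] c = new double[linearSum.length];
        for (int i = 0; i < c.length; i++) {
            c[i] = linearSum[i] / n;
        }
        return c;
    }

    @Override
    public double getWeight() {
        return n;
    }
}
```

In this sketch the inherited fields are never written, so inspecting them directly tells you nothing about the cluster; only the overridden getters reflect the data that was inserted.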
Now, are the values I got "reliable"?
Moreover, what is the purpose of the weight field? It seemed to me that it represented the number of 'elements' each cluster has, but sometimes I get a real number ... If the weights are integers, the micro-cluster weights do not sum to the total number of samples, and the macro-cluster weights do not sum to the number of micro clusters ... Thanks in advance!