FZJ-IEK3-VSA/tsam

Allow deterministic k-means/k-medoids

Closed this issue · 4 comments

Allow setting the random_state parameter of scikit-learn's clustering methods so that the TSA is reproducible and deterministic (e.g., random_state=0).

@maximilian-hoffmann Isn't the predefinition of the seed already implemented?

if clusterMethod == "k_means":
    from sklearn.cluster import KMeans
    k_means = KMeans(n_clusters=n_clusters, max_iter=1000, n_init=n_iter, tol=1e-4)
(l. 63, tsam/tsam/periodAggregation.py)
I can't seem to find it in the source code. I also remember that, in the paper, you highlighted the deterministic behavior of hierarchical clustering as something unique to that method, although k-means and similar methods can be made deterministic as well by setting a seed.

Of course, you can set the seed, and the k-means algorithm will then be reproducible. Still, by design it depends on a randomized placement of the starting points, which is not the case for the hierarchical aggregation.
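To illustrate the point about the seed, here is a minimal sketch that fits scikit-learn's KMeans twice with the same random_state and compares the labels. The data is synthetic and the names candidates and fit_labels are made up for this example; the other arguments mirror the snippet quoted above, with n_init fixed to 10 as a stand-in for n_iter.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the candidate periods (rows) and their features (columns).
rng = np.random.default_rng(42)
candidates = rng.normal(size=(60, 24))


def fit_labels(seed):
    # Same settings as the quoted snippet, plus an explicit random_state.
    model = KMeans(
        n_clusters=4,
        max_iter=1000,
        n_init=10,
        tol=1e-4,
        random_state=seed,
    )
    return model.fit_predict(candidates)


labels_a = fit_labels(0)
labels_b = fit_labels(0)

# With an identical seed and identical data, both runs pick the same
# starting centroids, so the cluster assignments should match.
same = bool(np.array_equal(labels_a, labels_b))
print(same)
```

Without random_state, each call draws fresh starting points, so two runs are not guaranteed to agree.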

I would be quite happy if someone implemented the seed as an argument and opened a pull request. :)
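As a rough idea of what such a change could look like, the sketch below forwards an optional seed to KMeans. The helper make_clusterer and its random_state argument are hypothetical and not part of tsam; only the "k_means" branch is taken from the snippet quoted earlier, and the "hierarchical" branch with Ward linkage is an assumption added for contrast.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans


def make_clusterer(clusterMethod, n_clusters, n_iter=10, random_state=None):
    """Hypothetical helper (not tsam's API): build a scikit-learn clusterer."""
    if clusterMethod == "k_means":
        # random_state=None keeps the current randomized behavior;
        # an integer makes the centroid initialization repeatable.
        return KMeans(
            n_clusters=n_clusters,
            max_iter=1000,
            n_init=n_iter,
            tol=1e-4,
            random_state=random_state,
        )
    if clusterMethod == "hierarchical":
        # Agglomerative clustering has no random initialization,
        # so it takes no seed. Ward linkage is assumed here.
        return AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    raise ValueError(f"Unsupported clusterMethod: {clusterMethod!r}")


seeded = make_clusterer("k_means", n_clusters=8, random_state=0)
unseeded = make_clusterer("k_means", n_clusters=8)
```

Defaulting the argument to None would leave existing calls unchanged, so only users who pass a seed get the reproducible behavior.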

Since it does not seem crucial, I will close this issue. In case someone wants to save the seeds and reuse them, feel free to reopen it.