/wkmeans

k-means algorithm with (optional) instance weights.

Primary LanguageCMIT LicenseMIT

WKMEANS		            Copyright (c) 2012, Deniz Yuret

Usage: wkmeans [options] < input > output
  -k number of clusters (default 2)
  -r number of restarts (default 0)
  -s random seed
  -l input file contains labels
  -w input file contains instance weights
  -v verbose output
  
Input format (assuming you have m vectors of n dimensions):
[label_1] [weight_1] x_11 ... x_1m
...
[label_m] [weight_m] x_m1 ... x_mn

label_i  : (string) label of the ith vector, required when -l used
weight_i : (double) weight of the ith vector, required when -w used
x_ij     : (double) ith vector's jth component

Output format:
[label_1] c_1
...
[label_m] c_m

c_i : (int) cluster of ith vector


Algorithm: wkmeans is a k-means algorithm with (optional) instance
weights.

* Based on mpi_kmeans-1.5 by Peter Gehler.

* Based on C. Elkan. Using the triangle inequality to accelerate
  kMeans. ICML 2003.

* Initialization based on Arthur, D. and Vassilvitskii,
  S. (2007). K-means++: the advantages of careful seeding.
  Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete
  algorithms. pp. 1027-1035.

Please see the file LICENSE for terms of use.  Everything is standard
C, so just typing make should give you an executable.