Primary language: Python. License: GNU General Public License v2.0 (GPL-2.0).

pysparcl

Python implementation of the sparse clustering methods of Witten and Tibshirani (2010).

Demo results

Each sample has 1000 features, of which 1% (10 features) are informative.

[Figure: two panels, "Hierarchical clustering" and "Sparse hierarchical clustering"]
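
To make the demo setting concrete, here is a minimal sketch of how data with 1000 features and 1% informative ones could be generated. It is not the repository's demo script: the function name `make_sparse_clusters`, the sample count, the number of clusters, and the size of the cluster shift are all assumptions chosen for illustration.

```python
import numpy as np


def make_sparse_clusters(n_samples=60, n_features=1000, informative_frac=0.01,
                         n_clusters=3, shift=2.0, seed=0):
    """Hypothetical generator: clusters differ only in a few features."""
    rng = np.random.default_rng(seed)
    n_informative = int(round(n_features * informative_frac))

    # Assign samples to clusters in round-robin order.
    labels = np.arange(n_samples) % n_clusters

    # Start from pure Gaussian noise in every feature.
    X = rng.standard_normal((n_samples, n_features))

    # Shift only the first n_informative features by a per-cluster centre;
    # the remaining features carry no cluster signal.
    centres = shift * rng.standard_normal((n_clusters, n_informative))
    X[:, :n_informative] += centres[labels]
    return X, labels


X, labels = make_sparse_clusters()
print(X.shape)
```

With so few informative features, distances computed over all 1000 columns are dominated by noise, which is the situation sparse clustering is meant to address.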

Functions

  • Sparse hierarchical clustering
  • Sparse KMeans clustering
  • Selection of tuning parameter for sparse hierarchical clustering
  • Selection of tuning parameter for sparse KMeans clustering
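
For context on what the tuning parameter controls: in Witten and Tibshirani (2010), sparse K-means assigns each feature a non-negative weight, and the tuning parameter s bounds the L1 norm of the weight vector while its L2 norm is bounded by 1. The sketch below shows that weight update (soft-thresholding with a binary search on the threshold) as described in the paper. It is an illustration under those assumptions, not pysparcl's internal code; the names `update_weights`, `a`, `s`, and `n_iter` are hypothetical.

```python
import numpy as np


def update_weights(a, s, n_iter=60):
    """Maximise w @ a subject to ||w||_2 <= 1, ||w||_1 <= s, w >= 0.

    a : per-feature scores (e.g. between-cluster sums of squares).
    s : L1 bound on the weights, assumed to satisfy s >= 1.
    """
    a = np.maximum(np.asarray(a, dtype=float), 0.0)
    if s < 1.0 or not a.any():
        raise ValueError("need s >= 1 and at least one positive score")

    # If the unthresholded solution already satisfies the L1 bound, use it.
    w = a / np.linalg.norm(a)
    if w.sum() <= s:
        return w

    # Otherwise search for the threshold delta at which ||w||_1 reaches s.
    lo, hi = 0.0, a.max()
    for _ in range(n_iter):
        delta = 0.5 * (lo + hi)
        shrunk = np.maximum(a - delta, 0.0)  # soft-threshold (a is >= 0)
        norm = np.linalg.norm(shrunk)
        if norm == 0.0:
            hi = delta  # threshold removed every feature: too large
            continue
        w = shrunk / norm
        if w.sum() > s:
            lo = delta  # weights still too dense: raise the threshold
        else:
            hi = delta  # L1 bound satisfied: try a smaller threshold
    return w


scores = np.array([5.0, 3.0, 1.0, 0.1])
w = update_weights(scores, s=1.2)
print(w)
```

A smaller s drives more weights to exactly zero, so fewer features contribute to the clustering; a larger s approaches ordinary clustering on all features.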

Installation

Getting pysparcl

git clone https://github.com/tsurumeso/pysparcl.git

Run setup script

cd pysparcl
python setup.py install

Run demo

Perform sparse hierarchical clustering.

cd demo
python run.py

Perform sparse KMeans clustering.

cd demo
python run.py -m kmeans

Usage

import matplotlib.pyplot as plt
import pysparcl

from scipy.cluster.hierarchy import dendrogram
from scipy.cluster.hierarchy import linkage


# X is a NumPy array of shape (n_samples, n_features).
perm = pysparcl.hierarchy.permute(X)
result = pysparcl.hierarchy.pdist(X, wbound=perm['bestw'])
link = linkage(result['u'], method='average')
dendro = dendrogram(link)
plt.show()
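
If flat cluster labels are wanted rather than a dendrogram, a linkage matrix can be cut with SciPy's `fcluster`. The sketch below is self-contained, so it builds the linkage from plain Euclidean distances on toy data as a stand-in for the sparse dissimilarities used above; the toy data and the choice of two clusters are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Toy data: two well-separated groups of 10 samples with 5 features each.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(10, 5)),
    rng.normal(8.0, 1.0, size=(10, 5)),
])

# Average-linkage tree built from condensed pairwise Euclidean distances.
link = linkage(pdist(X), method='average')

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(link, t=2, criterion='maxclust')
print(labels)
```

The same `fcluster` call applies to any linkage matrix returned by `linkage`, including the one computed in the usage example above.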

References