Grape

Grape is a collection of document clustering algorithms written in Scala. It uses Apache OpenNLP to extract features from each document and to build the vector space on which the different clustering approaches operate. Grape currently contains the following algorithms:

K-Means Clustering
Hierarchical Agglomerative Clustering
Buckshot Clustering
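
All three algorithms work on documents represented as vectors. As a rough illustration of that idea only, the sketch below builds term-frequency vectors and compares them with cosine similarity. It does not use Grape or OpenNLP: the object name VectorSpaceSketch, the naive tokenizer, and the choice of raw term counts as features are all assumptions made for this example, not a description of Grape's internals.

```scala
object VectorSpaceSketch {
  // A sparse vector: term -> weight. (Illustrative type, not Grape's.)
  type Vec = Map[String, Double]

  // Naive tokenizer (assumption): lowercase, split on anything that is not a-z.
  def tokenize(text: String): List[String] =
    text.toLowerCase.split("[^a-z]+").filter(_.nonEmpty).toList

  // Term-frequency vector for one document: how often each term occurs.
  def termFrequencies(text: String): Vec =
    tokenize(text).groupBy(identity).map { case (term, occ) => term -> occ.size.toDouble }

  // Cosine similarity between two sparse vectors; 0.0 if either is empty.
  def cosine(a: Vec, b: Vec): Double = {
    val dot = a.iterator.map { case (term, w) => w * b.getOrElse(term, 0.0) }.sum
    val normA = math.sqrt(a.values.map(w => w * w).sum)
    val normB = math.sqrt(b.values.map(w => w * w).sum)
    if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
  }
}
```

A clustering algorithm can then group documents whose vectors are close to each other under such a similarity measure.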

How to use

An example of how to use K-Means clustering on your documents:

import com.jayway.textmining.{NLPFeatureSelection, Cluster, KMeanCluster}

// number of clusters
val k = ...

// A document is a pair of (Document ID, Document Content). ID can be anything.
val docs: List[(String, String)] = ...

val kMeanCluster = new KMeanCluster(docs, k) with NLPFeatureSelection
val clusters: List[Cluster] = kMeanCluster.doCluster()
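
One possible way to build the docs list is to read plain-text files from a directory, as sketched below. DocumentLoader and loadDocs are hypothetical helpers that are not part of Grape. The sketch assumes UTF-8 encoded .txt files and uses each file name as the document ID.

```scala
import java.io.File
import java.nio.charset.StandardCharsets
import java.nio.file.Files

// Hypothetical helper, not part of Grape.
object DocumentLoader {
  // Reads every .txt file directly inside `dir` and returns
  // (file name, file content) pairs sorted by file name.
  // Returns an empty list if `dir` does not exist or is not a directory.
  def loadDocs(dir: File): List[(String, String)] =
    Option(dir.listFiles()).getOrElse(Array.empty[File])
      .filter(f => f.isFile && f.getName.endsWith(".txt"))
      .sortBy(_.getName)
      .map(f => f.getName -> new String(Files.readAllBytes(f.toPath), StandardCharsets.UTF_8))
      .toList
}
```

The resulting list has the (Document ID, Document Content) shape that KMeanCluster expects for docs.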

License

Distributed under the Apache Software License.
