Clustering Systems

This repository contains implementations of K-Means Clustering and Hierarchical Clustering.

The current hierarchical clustering algorithm uses agglomeration to create the clusters and uses the following linkages:

The agglomerative hierarchical clustering script is divided into two classes:

These are further divided into the following methods:

matrix_min(): Returns the current minimum value in the passed matrix.
min_cluster_distance(): Returns the minimum distance between clusters.
max_cluster_distance(): Returns the maximum distance between clusters.
avg_cluster_distance(): Returns the average distance between clusters.
matrix_gen(): Generates a new proximity matrix after cluster formation.
clustering(): Clusters points agglomeratively and returns the linkage matrix.
distance(): Calculates distance between points.
raw_matrix(): Generates the proximity matrix for the first time from data.
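
The way the methods listed above fit together can be sketched as follows. This is only a minimal illustration, not the repository's actual code: the function names echo the list, but their signatures are assumptions, `cluster_distance()` is a hypothetical helper that folds the three linkage functions into one, and Euclidean distance on numeric points stands in for whatever `distance()` computes on amino acid sequences. Unlike `matrix_gen()`, the sketch does not rebuild the proximity matrix after each merge; it recomputes cluster distances from the initial point-to-point matrix, which is simpler but slower.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+); an assumed stand-in metric


def raw_matrix(points):
    """Build the initial proximity matrix as a dict keyed by (i, j) with i < j."""
    return {(i, j): dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}


def cluster_distance(a, b, proximity, linkage):
    """Distance between clusters a and b, each a list of point indices.

    Hypothetical helper combining the min/max/avg linkage functions.
    """
    pair_distances = [proximity[(min(i, j), max(i, j))] for i in a for j in b]
    if linkage == "min":  # single linkage: closest pair of points
        return min(pair_distances)
    if linkage == "max":  # complete linkage: farthest pair of points
        return max(pair_distances)
    if linkage == "avg":  # average linkage: mean over all pairs
        return sum(pair_distances) / len(pair_distances)
    raise ValueError(f"unknown linkage: {linkage!r}")


def clustering(points, linkage="min"):
    """Repeatedly merge the two closest clusters until one remains.

    Returns rows of [id_a, id_b, distance, merged_size]. Original points
    have ids 0..n-1 and each merge creates the next unused id.
    """
    proximity = raw_matrix(points)
    clusters = {i: [i] for i in range(len(points))}  # cluster id -> point indices
    next_id = len(points)
    merges = []
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest linkage distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(
                clusters[pair[0]], clusters[pair[1]], proximity, linkage),
        )
        d = cluster_distance(clusters[a], clusters[b], proximity, linkage)
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[next_id] = merged
        merges.append([a, b, d, len(merged)])
        next_id += 1
    return merges


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (5, 0), (5, 3)]
    for row in clustering(sample, linkage="min"):
        print(row)
```

Each returned row records which two clusters were merged, at what distance, and how many points the new cluster holds. This four-column layout is the one commonly used for linkage matrices and is what a dendrogram is drawn from.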

The algorithms were run on a dataset consisting of amino acid sequences. The results are published as dendrograms:

K-Means Clustering

Hierarchical clustering:

Aditya Srikanth, Prateek Das Gupta

aditya-srikanth/Clustering-Systems