Supercluster

A .NET machine learning focusing on clustering and metric learning. Currently I am just starting out, but I plan to focus on algorithms that are new (such as the ENN algorithm) as well as old classics (such as DBSCAN). The project is under heavy development and I have many things planed.

Philosophy

I am the only developer on the project so I can do it my way.

Research first: All algorithms are self-written. I demand a complete and deep understanding before I start coding. This involves reading many research papers. Many cups of coffee.
Correctness: I am a mathematician by training. Machine Learning is just mathematics. Mathematics must be correct, hence all algorithms are rigorously unit tested.
Documentation: I strive for super-thorough documentation. Admittedly at this stage it is hard to keep up. But after release 1.1. I will re-document the code base.
Quality: Thorough commenting, style-cop compliance, and following good software design principles. Mathematicians have been writing bad code for too long. This stops now. Quality is as important as correctness.
Appropriate Efficiency: Code is optimized where it needs to be optimized. Let's be honest. This is C#. I can't compete with C++, but more and more people are using high-level languages for data-science. When an optimization can drastically improve performance (e.g., using k-d-trees, multi-threading) I optimize. When optimization complicates the code for a 1% speed increase, I don't.

###Algorithms Here is a list of currently supported algorithms:

k-NN
ENN (1st C# implementation)
k-Means

Next Release

Algorithms

Data Structures

~~A KD-Tree implementation.~~ Done: https://github.com/eregina92/Supercluster.KDTree
~~An M-Tree implemention.~~ (1st C# implementation (That I could find.))

Code

Refactor of the ENN code.
Updating kNN, kMeans and ENN to accept a Tree during construction