Repository containing the final project for the Methods of Data Mining course of the EIT Digital Data Science master's programme at Aalto University 📚.

Data mining final project

Objective

The objective of this project is to test different clustering algorithms applied to two datasets:

  • genedata: contains the encoding of a genetic sequence.
  • msdata: contains the results of a mass spectrometry analysis.

The quality of each clustering is evaluated with the normalized mutual information (NMI) score, which compares the predicted cluster labels against the reference labels.
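As an illustration of how NMI behaves, here is a minimal pure-Python sketch. It is not the project's own evaluation code. It assumes the arithmetic mean of the two label entropies as the normaliser, which is one common convention; the project may use a different one. The function names `entropy`, `mutual_information` and `nmi` are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(labels_true, labels_pred):
    """Mutual information (in nats) between two label assignments."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    mi = 0.0
    for (t, p), c in joint.items():
        # p(t, p) * log( p(t, p) / (p(t) * p(p)) ), written with raw counts
        mi += (c / n) * log(c * n / (count_true[t] * count_pred[p]))
    return mi


def nmi(labels_true, labels_pred):
    """NMI normalised by the arithmetic mean of the entropies (assumption)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    h_true = entropy(labels_true)
    h_pred = entropy(labels_pred)
    if h_true == 0.0 and h_pred == 0.0:
        # Both assignments put every sample in one cluster: treat as identical.
        return 1.0
    return mutual_information(labels_true, labels_pred) / ((h_true + h_pred) / 2)


if __name__ == "__main__":
    reference = [0, 0, 1, 1]
    predicted = [1, 1, 0, 0]  # same partition, cluster ids swapped
    print(nmi(reference, predicted))
```

Because NMI depends only on how samples are grouped, renaming the cluster ids does not change the score: a clustering that reproduces the reference partition scores 1 (up to floating-point error), while one that is statistically independent of the reference scores 0.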

Project structure

The code for the clustering tests is split across separate Python scripts, one directory per dataset:

  • src/genedata: analysis done for the first dataset.
  • src/msdata: analysis done for the second dataset.

Authors