KTH ID2222 Data Mining

ID2222-Data-Mining

Solutions for the KTH course ID2222 Data Mining, which covers data mining techniques for analysing large-scale datasets. For more information, please refer to the course webpage. The homework solutions were mostly implemented in Python.

Homework description

  • homework 1 - Similar Items: Find similar documents using minhashing and LSH techniques
  • homework 2 - Association Rules: Find frequent itemsets and association rules using the Apriori algorithm
  • homework 3 - Data Streams: Estimate triangle counts in a streaming graph of edge insertions using TRIEST
  • homework 4 - Graph Spectra: Implementation of the spectral graph clustering algorithm described in this paper
  • homework 5 - Graph Partitioning: Implementation of the Ja-Be-Ja algorithm for k-way graph partitioning in a distributed environment
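
As a rough illustration of the idea behind homework 1, the following is a minimal sketch of minhashing in Python. It is not the code from this repository: the function names, the shingle length `k=5`, the CRC32 shingle hashing, and the linear hash family modulo a prime are all assumptions chosen for the example. It estimates the Jaccard similarity of two documents as the fraction of positions where their minhash signatures agree.

```python
import random
import zlib

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus


def shingles(text, k=5):
    """Return the set of character k-shingles of a whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def make_hash_params(n, seed=0):
    """Draw n (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]


def minhash_signature(shingle_set, params):
    """Signature = minimum hash value over all shingles, per hash function."""
    # CRC32 gives a hash that is stable across runs (unlike built-in hash()).
    ids = [zlib.crc32(s.encode("utf-8")) for s in shingle_set]
    return [min((a * x + b) % PRIME for x in ids) for a, b in params]


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions on which the two signatures agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    doc_a = "data mining techniques for analysing large-scale datasets"
    doc_b = "data mining methods for analysing large-scale data"
    params = make_hash_params(200)
    sa, sb = shingles(doc_a), shingles(doc_b)
    print("exact Jaccard:   ", jaccard(sa, sb))
    print("minhash estimate:",
          estimate_similarity(minhash_signature(sa, params),
                              minhash_signature(sb, params)))
```

An LSH step would then split each signature into bands and hash each band to a bucket, so that only documents sharing at least one bucket are compared as candidate pairs.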