Repository for the "Data Mining - Advanced Topics and Applications" exam project.

Data Mining 2

This project analyses and processes audio signals by applying advanced data mining and machine learning algorithms to the FMA (Free Music Archive) dataset.

The analysis focuses on:

  • Imbalanced learning: Random Undersampling, Condensed Nearest Neighbour (CNN), Tomek Links, Random Oversampling, SMOTE, K-Means SMOTE, ADASYN;
  • Anomaly detection: DBSCAN, KNN, LOF, ABOD, Isolation Forest, Extended Isolation Forest, Autoencoders;
  • Advanced classification methods: Naive Bayes, Rule-based classifiers, Logistic Regression, SVM, Ensembles (Random Forest, Bagging, AdaBoost) and Neural Networks (MLP);
  • Time series analysis: Motif and anomaly detection, Clustering, Shapelet-based classifiers;
  • Sequential Pattern Mining and Advanced Clustering: X-Means, OPTICS and Transactional Clustering (K-Modes);
  • AI Explainability: LIME explainer.
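
As a rough illustration of the first item in the list, the sketch below shows what random undersampling and random oversampling do to class counts. It is not taken from the project's notebooks: the `random_resample` helper, the toy feature vectors, and the "Rock"/"Jazz" labels are all hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter, defaultdict


def random_resample(X, y, strategy="over", seed=0):
    """Balance classes by random over- or undersampling (hypothetical helper)."""
    if strategy not in ("over", "under"):
        raise ValueError("strategy must be 'over' or 'under'")
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)

    # Oversampling grows every class to the largest class size;
    # undersampling shrinks every class to the smallest class size.
    sizes = [len(rows) for rows in by_class.values()]
    target = max(sizes) if strategy == "over" else min(sizes)

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            # Draw `target` rows without replacement.
            chosen = rng.sample(rows, target)
        else:
            # Keep every original row, then add random duplicates.
            chosen = rows + rng.choices(rows, k=target - len(rows))
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out


# Toy data (made up): 8 "Rock" tracks versus 2 "Jazz" tracks.
X = [[i, i * 0.5] for i in range(10)]
y = ["Rock"] * 8 + ["Jazz"] * 2

X_over, y_over = random_resample(X, y, strategy="over")
X_under, y_under = random_resample(X, y, strategy="under")
print(Counter(y_over))   # both classes now have 8 samples
print(Counter(y_under))  # both classes now have 2 samples
```

Methods such as SMOTE and ADASYN differ from this sketch in that they synthesise new minority samples by interpolation instead of duplicating existing rows.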

The dataset and additional information are available at: mdeff/fma