IBM HR Analytics Employee Attrition and Performance

Data Mining 1: Foundations

In collaboration with Mario Bianchi

Analysis of the dataset IBM HR Analytics Employee Attrition & Performance

Tasks of the project:

  1. Data Understanding: Explore the dataset with the analytical tools studied and write a concise “data understanding” report describing data semantics, assessing data quality, the distribution of the variables and the pairwise correlations.
  2. Clustering analysis: Explore the dataset using various clustering techniques. Carefully describe your decisions for each algorithm and the advantages each approach provides.
  3. Classification: Explore the dataset using classification trees. Use them to predict the target variable.
  4. Association Rules: Explore the dataset using frequent pattern mining and association rule extraction. Then use the rules to predict a variable, either to replace missing values or to predict the target variable.
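
As an illustration of task 4, here is a minimal sketch of how association rules can be turned into a predictor. It is not the notebook's implementation. It uses only the Python standard library, a brute-force itemset count instead of Apriori or FP-Growth, and a handful of invented rows. The item names (`OverTime=...`, `Travel=...`, `Attrition=...`), the thresholds, and the helper functions `frequent_itemsets`, `rules_for_target` and `predict` are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Toy transactions: each row is a set of "attribute=value" items.
# These rows are invented for illustration, NOT taken from the IBM dataset.
rows = [
    {"OverTime=Yes", "Travel=Frequent", "Attrition=Yes"},
    {"OverTime=Yes", "Travel=Rare", "Attrition=Yes"},
    {"OverTime=No", "Travel=Rare", "Attrition=No"},
    {"OverTime=No", "Travel=Frequent", "Attrition=No"},
    {"OverTime=Yes", "Travel=Frequent", "Attrition=Yes"},
    {"OverTime=No", "Travel=Rare", "Attrition=No"},
]


def frequent_itemsets(rows, min_support, max_size=2):
    """Return {itemset: support} for itemsets up to max_size (brute-force count)."""
    counts = Counter()
    for row in rows:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(row), size):
                counts[frozenset(combo)] += 1
    n = len(rows)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def rules_for_target(itemsets, target_prefix, min_confidence):
    """Build rules 'antecedent -> target item' with confidence >= min_confidence."""
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            if not item.startswith(target_prefix):
                continue
            antecedent = itemset - {item}
            if antecedent not in itemsets:
                continue
            # confidence = support(antecedent and consequent) / support(antecedent)
            confidence = support / itemsets[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, item, support, confidence))
    # Highest confidence first, then highest support, then a stable name order.
    return sorted(rules, key=lambda r: (-r[3], -r[2], sorted(r[0]), r[1]))


def predict(row_items, rules, default):
    """Return the consequent of the first (best-ranked) rule whose antecedent matches."""
    for antecedent, consequent, _support, _confidence in rules:
        if antecedent <= row_items:
            return consequent
    return default


itemsets = frequent_itemsets(rows, min_support=0.3)
rules = rules_for_target(itemsets, "Attrition=", min_confidence=0.8)
prediction = predict({"OverTime=Yes", "Travel=Rare"}, rules, default="Attrition=No")
print(prediction)
```

The same `predict` helper could fill a missing value instead of the target by changing `target_prefix` to the attribute being imputed. On the real dataset, continuous attributes would first need to be discretized into items, and a dedicated frequent pattern mining library would be a better fit than the brute-force count used here.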