Frequent Itemset Mining: eclat vs fp-growth

Overview

This project compares the performance of two frequent itemset mining algorithms, eclat and fp-growth, on 6 datasets:

retail and accident (source)
groceries (source)
bats (source)
abalone (source)
house (source)
adult (source)

The study aims to identify the key dataset characteristics that affect the performance of the two algorithms. The report summarizes the key findings and includes supporting figures.
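For readers new to the topic, the following is a minimal sketch of the idea behind eclat: store, for each item, the set of transaction ids that contain it, then grow itemsets depth-first by intersecting those sets. It is an illustration only and is not the code in miner.py. The function name `eclat_sketch`, its parameters, and the toy transactions are assumptions made for this example, and `min_support` is taken to be an absolute count of transactions.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def eclat_sketch(
    transactions: Iterable[Iterable[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least `min_support` transactions.

    Hypothetical helper for illustration; items are assumed to be strings.
    """
    # Vertical layout: map each item to the ids of the transactions containing it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[FrozenSet[str], int] = {}

    def extend(prefix: Tuple[str, ...], candidates: List[Tuple[str, Set[int]]]) -> None:
        # Every candidate passed in is already known to be frequent with `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[frozenset(itemset)] = len(tids)

            # Intersect with the remaining candidates to build longer itemsets.
            next_candidates: List[Tuple[str, Set[int]]] = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    next_candidates.append((other, common))
            if next_candidates:
                extend(itemset, next_candidates)

    # Start from the frequent single items, in a fixed order for determinism.
    initial = sorted(
        (pair for pair in tidsets.items() if len(pair[1]) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), initial)
    return frequent


if __name__ == "__main__":
    # Toy data, invented for this example.
    toy = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, support in eclat_sketch(toy, min_support=2).items():
        print(sorted(itemset), support)
```

fp-growth targets the same output but takes a different route: it compresses the transactions into a prefix tree and mines that tree instead of intersecting transaction-id sets. This difference is one reason the two algorithms can behave differently across datasets.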

What's included

The repository includes:

report.pdf: the report presenting key results and corresponding discussion
/code: the code directory, including:
- /datasets: a copy of the dataset files used in the study.
- /output: directory used to save the output of the two miners and related figures.
- helper.py: includes helper functions for running the experiments.
- miner.py: helper code for running the two mining algorithms, eclat and fp-growth.
- main.py: main file to run the miners and generate figures.

To re-run the experiments, clone the repository and run the file 'main.py'. Use the report as a reference for interpreting the results.
