This repository compares the performance of two frequent itemset mining algorithms, eclat and fp-growth, on the following datasets:
- retail and accident (source)
- groceries (source)
- bats (source)
- abalone (source)
- house (source)
- adult (source)
The study aims to identify the key characteristics of the datasets affecting the performance of the two algorithms. The report presents a summary of the key findings along with supporting figures.
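As a rough illustration of what "dataset characteristics" can mean here, the sketch below computes a few descriptive statistics that are commonly used to describe transaction databases (number of transactions, number of distinct items, average transaction length, and density). The function name `dataset_profile` is hypothetical and is not part of this repository, and the characteristics actually analysed in the study are the ones given in the report.

```python
from typing import Dict, Iterable, List, Set


def dataset_profile(transactions: Iterable[Iterable[str]]) -> Dict[str, float]:
    """Summarize a transaction database (hypothetical helper, not from this repo)."""
    # Itemset mining treats each transaction as a set, so drop duplicate items.
    rows: List[Set[str]] = [set(t) for t in transactions]
    items: Set[str] = set().union(*rows) if rows else set()

    n_rows = len(rows)
    n_items = len(items)
    total_entries = sum(len(r) for r in rows)

    # Average number of distinct items per transaction.
    avg_len = total_entries / n_rows if n_rows else 0.0
    # Density: filled fraction of the transactions-by-items matrix.
    density = total_entries / (n_rows * n_items) if n_rows and n_items else 0.0

    return {
        "transactions": float(n_rows),
        "distinct_items": float(n_items),
        "avg_transaction_length": avg_len,
        "density": density,
    }


if __name__ == "__main__":
    toy = [["a", "b"], ["b", "c", "d"], ["a"]]
    print(dataset_profile(toy))
```

Dense datasets with long transactions and sparse datasets with short ones often stress mining algorithms differently, which is why statistics like these are a natural starting point for a comparison.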
The repository includes:
- report.pdf: the report presenting key results and corresponding discussion
- /code: the code directory, including:
  - /datasets: a copy of the dataset files used in the study.
  - /output: directory used to save the output of the two miners and related figures.
  - helper.py: includes helper functions for running the experiments.
  - miner.py: helper code for running the two mining algorithms, eclat and fp-growth.
  - main.py: main file to run the miners and generate figures.
To re-run the experiments, clone the repository and run 'main.py'. Use the report as a reference for interpreting the results.
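If you want to time an individual miner yourself, a minimal wall-clock harness might look like the sketch below. It is an illustration only: `time_miner` is a hypothetical name, it is not the code in helper.py or miner.py, and the built-in `sorted` is used purely as a stand-in for a mining function.

```python
import time
from typing import Any, Callable, Tuple


def time_miner(miner: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call `miner` once and return (result, elapsed seconds).

    Hypothetical helper for illustration; not taken from this repository.
    """
    start = time.perf_counter()
    result = miner(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # `sorted` stands in for a real miner here (assumption for the demo).
    result, seconds = time_miner(sorted, [3, 1, 2])
    print(result, f"{seconds:.6f}s")
```

A single run can be noisy, so repeating each measurement several times and comparing medians is usually more reliable than relying on one timing.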