Apriori and a few of its variants

frequent-itemsets

Python implementations of the Apriori and PCY algorithms, plus the SON and random-sampling variants. Works with Python 3.6 and 3.7.

The Apriori algorithm uncovers hidden structure in categorical data. The classical example is a database of supermarket purchases, where every purchase has a number of items associated with it. We would like to uncover association rules such as {bread, eggs} -> {bacon} from the data. This is the goal of association rule learning, and Apriori is arguably the most famous algorithm for this problem.
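To make the idea concrete, here is a minimal sketch of level-wise frequent itemset mining and of the confidence of a rule such as {bread, eggs} -> {bacon}. It is not the code in apriori.py; the function name `apriori`, the `min_support` parameter (an absolute transaction count) and the toy baskets are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): count} for every itemset that occurs
    in at least `min_support` transactions (hypothetical helper)."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep the frequent ones.
    item_counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in item_counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Candidates of size k are unions of two frequent (k-1)-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Apriori pruning: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count how many transactions contain each surviving candidate.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Toy data, invented for this example.
baskets = [
    {"bread", "eggs", "bacon"},
    {"bread", "eggs", "bacon"},
    {"bread", "eggs"},
    {"bread", "milk"},
    {"milk"},
]
frequent = apriori(baskets, min_support=2)

# Confidence of {bread, eggs} -> {bacon}:
# support of the full itemset divided by support of the left-hand side.
lhs = frozenset({"bread", "eggs"})
rule = frozenset({"bread", "eggs", "bacon"})
confidence = frequent[rule] / frequent[lhs]
print(sorted(map(sorted, frequent)), confidence)
```

The pruning step is what gives the algorithm its name: an itemset can only be frequent if all of its subsets are, so most candidates are discarded before they are ever counted.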

This repository contains five Python scripts. They use the retail dataset from http://fimi.ua.ac.be/data/retail.dat. The dependencies are matplotlib and numpy. Each implementation runs its algorithm and then graphs the results.
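As a rough illustration of how such a dataset can be loaded, the sketch below assumes the usual FIMI layout: one transaction per line, with items given as space-separated integer IDs. The helper name `read_transactions` and the sample lines are hypothetical, and the scripts in this repository may read the file differently.

```python
def read_transactions(lines):
    """Parse an iterable of text lines into a list of frozensets of item IDs.

    Assumes one transaction per line and whitespace-separated integers.
    Blank lines are skipped.
    """
    baskets = []
    for line in lines:
        items = line.split()
        if items:
            baskets.append(frozenset(int(x) for x in items))
    return baskets


# In-memory sample in the assumed format, so the sketch runs without the file.
sample = ["0 1 2 3", "30 31 32", "", "33 34 35 1"]
baskets = read_transactions(sample)
print(len(baskets), sorted(baskets[0]))
```

With a local copy of the dataset (the path here is an assumption), the same helper can be fed an open file handle: `with open("retail.dat") as f: baskets = read_transactions(f)`.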

files

  • The first is apriori.py, an implementation of the Apriori algorithm.
  • The second is pcy.py, an implementation of the PCY algorithm.
  • The third is SON.py, an implementation of the SON algorithm.
  • The fourth is RS.py, an implementation of the random-sampling version of Apriori.
  • The fifth is graph.py, which runs all the implementations and graphs them on a single chart.
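
For readers unfamiliar with how PCY differs from plain Apriori, the sketch below shows the core idea for pairs only: the first pass hashes every pair into a bucket while counting single items, and the second pass counts a pair only if both of its items are frequent and its bucket is frequent. This is an illustrative sketch, not the code in pcy.py; the function name `pcy_pairs`, the bucket count and the toy transactions are assumptions.

```python
from collections import Counter
from itertools import combinations


def pcy_pairs(transactions, min_support, n_buckets=101):
    """Return {(a, b): count} for frequent pairs, using PCY-style bucket filtering.

    Items must be hashable and mutually comparable (e.g. integer IDs).
    """
    # Sort each transaction so every pair has one canonical tuple form.
    transactions = [sorted(set(t)) for t in transactions]

    # Pass 1: count single items and hash each pair into a bucket counter.
    item_counts = Counter()
    buckets = [0] * n_buckets
    for t in transactions:
        item_counts.update(t)
        for pair in combinations(t, 2):
            buckets[hash(pair) % n_buckets] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    # Between the passes, the bucket counts are reduced to one flag per bucket.
    bitmap = [c >= min_support for c in buckets]

    # Pass 2: count only pairs of frequent items that fall in a frequent bucket.
    pair_counts = Counter()
    for t in transactions:
        kept = [i for i in t if i in frequent_items]
        for pair in combinations(kept, 2):
            if bitmap[hash(pair) % n_buckets]:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Toy transactions with integer item IDs, invented for this example.
transactions = [[1, 2, 3], [1, 2], [1, 2, 4], [3, 4], [2, 3]]
frequent_pairs = pcy_pairs(transactions, min_support=2)
print(frequent_pairs)
```

The bucket filter can let infrequent pairs through when several pairs collide in one bucket, which is why the final support check is still needed, but it never drops a frequent pair: a bucket's count is always at least the count of any pair hashed into it.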