/apriori

A playground for market-basket analysis.

Primary Language: Python

Apriori Algorithm Challenge

Hey y'all!

I originally kicked this off as an open challenge to analyze the datasets included here. (Read the original challenge here)

Since then, I've decided to make the challenge a bit more structured. I wrote a blog post on the intuition behind the Apriori Algorithm, where I challenged readers to implement the algorithm themselves.
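If you'd like something to compare your own attempt against, here is a minimal sketch of the textbook Apriori loop. It is not the code from the blog post or from this repo. The function name `apriori` and the `min_support` parameter (an absolute basket count) are my own assumptions for illustration, and it favors readability over speed.

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return {frozenset(itemset): count} for every itemset that appears
    in at least `min_support` baskets. (Hypothetical helper, not repo code.)"""
    baskets = [frozenset(b) for b in baskets]

    # Pass 1: count single items and keep the frequent ones.
    counts = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one scan over the baskets per level.
        counts = {c: 0 for c in candidates}
        for basket in baskets:
            for c in candidates:
                if c <= basket:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


if __name__ == "__main__":
    demo = [["milk", "bread"], ["milk", "bread", "eggs"],
            ["bread", "eggs"], ["milk", "eggs"]]
    for itemset, count in apriori(demo, min_support=2).items():
        print(sorted(itemset), count)
```

The prune step is the whole point of the algorithm: a k-itemset can only be frequent if all of its (k-1)-subsets are, so most candidates never need to be counted.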

Here, I'm providing datasets that may be useful as you implement the algorithm. I've also included dataset_builder.py, which you can use to generate very large datasets and test your implementation "at scale". (You'll be running these on your own machine, most likely - so you won't be doing anything at petabyte scale. But hey - it's still fun.)

The hand-drawn mini dataset from the blog post is blog-baskets.txt; baskets.txt and other-baskets.txt are two other datasets that I encourage you to use.
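A reasonable first step with any of these files is to load the baskets and look at single-item support. The sketch below assumes one basket per line with comma-separated items. That format is a guess on my part, so check the files and adjust `sep` if they look different. The names `parse_baskets` and `item_support` are hypothetical helpers, not part of this repo.

```python
from collections import Counter


def parse_baskets(lines, sep=","):
    """Turn an iterable of text lines into a list of item sets.
    Assumes one basket per line, items separated by `sep`."""
    baskets = []
    for line in lines:
        items = {item.strip() for item in line.split(sep) if item.strip()}
        if items:  # skip blank lines
            baskets.append(items)
    return baskets


def item_support(baskets):
    """Fraction of baskets that contain each individual item."""
    counts = Counter(item for basket in baskets for item in basket)
    total = len(baskets)
    return {item: n / total for item, n in counts.items()}


if __name__ == "__main__":
    # Swap in baskets.txt or other-baskets.txt as needed.
    with open("blog-baskets.txt") as f:
        baskets = parse_baskets(f)
    for item, support in sorted(item_support(baskets).items()):
        print(f"{item}: {support:.2f}")
```

Because `parse_baskets` takes any iterable of lines, you can hand it an open file or a plain list of strings while testing.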


Wanna share your results?

Personally, I think this kind of thing is more fun when you engage with others. If you'd like to share your implementation with me - or give me any feedback you have on this - please reach out! My contact info is below.

And if you want to engage with others who are interested in Data Science - and hear about these projects as I'm creating them - join my mailing list. I send out a weekly email with some Data Science related content.

Send your findings to me at: dan [at] isaza [dot] dev