andreasjansson/parallel-frequent-itemset-mining-lastfm360k

Some questions

Hi Andreas,
Thank you very much for the code and the explanation of parallel FP-growth, but I don’t understand the role of sharding. Does it affect the efficiency of the parallel computation? Looking forward to your reply.

Hi @liuyn0505! I no longer maintain this repo. I wrote the code 8 years ago for a blog post and haven't touched it since, so it probably doesn't work anymore. However, if you find it useful for something, the blog post may help explain what's going on: https://andreasjansson.github.io/no-headache-hadoop/ (scroll to the headline "Example application").